Python: regex pattern fails after being split into multiple lines
Question:
I have a regex pattern that works fine if I write it in a single line.
For instance, the pattern works if I do the following:
import re

MAIN_TREE_START_STRING_PATTERN = (
    r"\*{3}\s+\bTotal Things\b:\s+(?P<number_of_things>\d+)"
)
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN, flags=re.X)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")
if match:
    print("Success")
else:
    print("Failed")
But if I change the regex pattern to a multiline string and use the VERBOSE
flag, it doesn’t work.
MAIN_TREE_START_STRING_PATTERN = r"""
\*{3}\s+\bTotal Things\b:\s+
(?P<number_of_things>\d+)
"""
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")
if match:
    print("Success")
else:
    print("Failed")
I’m not sure what I’m doing wrong during the multiline pattern declaration.
Answers:
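Line breaks inside a multiline string literal are part of the resulting string. Without re.VERBOSE, the regex engine treats those line breaks as literal characters to match, which is most probably why your regular expression no longer works: the pattern now starts with a newline, but the input starts with "***".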
To get rid of the line breaks, you can simply define multiple strings that get automatically concatenated by Python:
MAIN_TREE_START_STRING_PATTERN = (
    r"\*{3}\s+\bTotal Things\b:\s+"
    r"(?P<number_of_things>\d+)"
)
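To see the difference concretely, you can inspect both strings and try each one against the sample input. This is only an illustrative sketch: the names TRIPLE_QUOTED and CONCATENATED are made up here and do not appear in the question.

```python
import re

# Hypothetical names, introduced only for this illustration.
TRIPLE_QUOTED = r"""
\*{3}\s+\bTotal Things\b:\s+
(?P<number_of_things>\d+)
"""

CONCATENATED = (
    r"\*{3}\s+\bTotal Things\b:\s+"
    r"(?P<number_of_things>\d+)"
)

text = "*** Total Things: 348, abcdefghi ***"

# The triple-quoted literal begins with a newline; without re.VERBOSE the
# regex engine tries to match that newline literally.
print(repr(TRIPLE_QUOTED[:5]))

# Adjacent string literals are joined into one string, so no newline is added.
print("\n" in CONCATENATED)

# No flags are passed in either call.
triple_match = re.match(TRIPLE_QUOTED, text)
concat_match = re.match(CONCATENATED, text)

print(triple_match)                             # no match object
print(concat_match.group("number_of_things"))   # the captured digits
```

The first pattern cannot match at the start of the input because the input does not begin with a newline, while the concatenated pattern is identical to the original single-line one.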
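You are not passing re.VERBOSE to re.compile in the multiline version. Once you do, whitespace inside the pattern is ignored, so you have to match the space in "Total Things" explicitly, for example with [ ]: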
import re

MAIN_TREE_START_STRING_PATTERN = r"""
\*{3}\s+\bTotal[ ]Things\b:\s+
(?P<number_of_things>\d+)
"""
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN, re.VERBOSE)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")
if match:
    print("Success")
else:
    print("Failed")
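As a further sketch of what verbose mode allows, the pattern can carry inline comments, and a backslash-escaped space is an alternative to [ ]. The names VERBOSE_PATTERN, ESCAPED_SPACE_PATTERN and UNESCAPED_PATTERN are hypothetical and only used for this illustration.

```python
import re

# Hypothetical names, introduced only for this illustration.
VERBOSE_PATTERN = re.compile(
    r"""
    \*{3}\s+                      # three literal asterisks, then whitespace
    \bTotal[ ]Things\b:\s+        # the label; a character class keeps the space
    (?P<number_of_things>\d+)     # the count, captured by name
    """,
    re.VERBOSE,
)

ESCAPED_SPACE_PATTERN = re.compile(
    r"""
    \*{3}\s+
    \bTotal\ Things\b:\s+         # a backslash-escaped space is also kept
    (?P<number_of_things>\d+)
    """,
    re.VERBOSE,
)

# An unescaped space is dropped in verbose mode, so this is really "TotalThings".
UNESCAPED_PATTERN = re.compile(r"Total Things", re.VERBOSE)

text = "*** Total Things: 348, abcdefghi ***"

for pattern in (VERBOSE_PATTERN, ESCAPED_SPACE_PATTERN):
    match = pattern.match(text)
    print(match.group("number_of_things") if match else "Failed")

# The unescaped variant does not find "Total Things" but does find "TotalThings".
print(UNESCAPED_PATTERN.search(text))
print(UNESCAPED_PATTERN.search("TotalThings") is not None)
```

Both explicit-space variants capture the number, while the unescaped variant shows why a plain space silently stops matching once re.VERBOSE is enabled.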