Python: regex pattern fails after being split into multiple lines

Question:

I have a regex pattern that works fine if I write it in a single line.

For instance, the pattern works if I do the following:

MAIN_TREE_START_STRING_PATTERN = (
    r"\*{3}\s+\bTotal Things\b:\s+(?P<number_of_things>\d+)"
)
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN, flags=re.X)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")

if match:
    print("Success")
else:
    print("Failed")

But if I change the regex pattern to a multiline string and use the VERBOSE flag, it doesn’t work.

MAIN_TREE_START_STRING_PATTERN = r"""
\*{3}\s+\bTotal Things\b:\s+
(?P<number_of_things>\d+)
"""
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")

if match:
    print("Success")
else:
    print("Failed")

I’m not sure what I’m doing wrong during the multiline pattern declaration.

Asked By: Jacobo

||

Answers:

Line breaks inside a multiline string literal are part of the resulting string. Without re.VERBOSE, the regex engine treats those line breaks as literal characters to match, which is most probably why your regular expression no longer works.

To get rid of the line breaks, you can simply define multiple strings that get automatically concatenated by Python:

MAIN_TREE_START_STRING_PATTERN = (
    r"\*{3}\s+\bTotal Things\b:\s+"
    r"(?P<number_of_things>\d+)"
)
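A quick way to check this, as a rough sketch: the names SINGLE_LINE, CONCATENATED and TRIPLE_QUOTED are made up for illustration, and the patterns assume the backslash escapes from the question. The triple-quoted literal carries newline characters, while adjacent literals are joined into exactly the single-line pattern.

```python
import re

SINGLE_LINE = r"\*{3}\s+\bTotal Things\b:\s+(?P<number_of_things>\d+)"

# Adjacent string literals are joined by Python; no newlines are added.
CONCATENATED = (
    r"\*{3}\s+\bTotal Things\b:\s+"
    r"(?P<number_of_things>\d+)"
)

# A triple-quoted literal keeps every line break as a "\n" character.
TRIPLE_QUOTED = r"""
\*{3}\s+\bTotal Things\b:\s+
(?P<number_of_things>\d+)
"""

text = "*** Total Things: 348, abcdefghi ***"

print(CONCATENATED == SINGLE_LINE)   # same pattern string
print(TRIPLE_QUOTED.count("\n"))     # newlines embedded in the pattern

# Neither call passes re.VERBOSE, so the newlines must match literally.
concatenated_match = re.match(CONCATENATED, text)
triple_quoted_match = re.match(TRIPLE_QUOTED, text)

print("Success" if concatenated_match else "Failed")
print("Success" if triple_quoted_match else "Failed")
```

The triple-quoted version fails here because the pattern begins with a newline that the input does not contain.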
Answered By: DarkPlayer

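You are not using re.VERBOSE in the multiline version. Once you add it, unescaped whitespace in the pattern is ignored, so you have to match the space in "Total Things" explicitly, for example with [ ]: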

import re

MAIN_TREE_START_STRING_PATTERN = r"""
\*{3}\s+\bTotal[ ]Things\b:\s+
(?P<number_of_things>\d+)
"""
compiled_pattern = re.compile(MAIN_TREE_START_STRING_PATTERN, re.VERBOSE)
match = compiled_pattern.match(string="*** Total Things: 348, abcdefghi ***")

if match:
    print("Success")
else:
    print("Failed")
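
For comparison, here is a small sketch of the ways a literal space can survive re.VERBOSE. The names SPACE_VARIANTS and results are made up for illustration. It relies on the standard re module's rule that, in verbose mode, whitespace is ignored unless it sits inside a character class or is preceded by a backslash.

```python
import re

text = "*** Total Things: 348, abcdefghi ***"

# Hypothetical names: four spellings of the space between the two words.
SPACE_VARIANTS = {
    "character class": r"Total[ ]Things",
    "escaped space": r"Total\ Things",
    "whitespace class": r"Total\sThings",
    "bare space": r"Total Things",  # the space is dropped by re.VERBOSE
}

results = {}
for label, words in SPACE_VARIANTS.items():
    # Raw f-string: {{3}} becomes the literal quantifier {3}.
    pattern = rf"""
        \*{{3}}\s+\b{words}\b:\s+     # header up to the colon
        (?P<number_of_things>\d+)     # the count we want
    """
    results[label] = re.match(pattern, text, flags=re.VERBOSE) is not None

for label, matched in results.items():
    print(f"{label}: {'Success' if matched else 'Failed'}")
```

Only the bare-space variant fails, since verbose mode reduces it to "TotalThings". Verbose mode also allows the trailing # comments used above.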
Answered By: The fourth bird