"^" makes no difference in Python regex matching, but does in other regex testers

Question:

I have a regex pattern where I’m trying to match strings with the given format:

string1 = 'test_1.0.0_20220728_151206.log'

According to my regex helper app (Patterns on Mac), this regex matches the above:

pattern = '[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log'

However, this pattern would also match the following string:

string2 = '_test_1.0.0_20220728_151206.log'

Since I don’t want this string matched, I modified the pattern by adding a ^ at the beginning, which correctly matches the first string and not the second:

pattern = '^[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log'

However, in Python, when I call re.match(pattern, string) with either pattern, string1 is always matched and string2 is never matched. This is the behavior I want, but I don’t understand why the ^ makes no difference to Python’s matching:

>>> import re

# String 1 matches both patterns
>>> re.match('[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log',
...          'test_1.0.0_20220728_151206.log')
<re.Match object; span=(0, 30), match='test_1.0.0_20220728_151206.log'>

>>> re.match('^[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log',
...          'test_1.0.0_20220728_151206.log')
<re.Match object; span=(0, 30), match='test_1.0.0_20220728_151206.log'>

# String 2 does not
>>> re.match('[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log',
...          '_test_1.0.0_20220728_151206.log')

>>> re.match('^[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log',
...          '_test_1.0.0_20220728_151206.log')

For anyone using the Patterns app, I use the "Default Flavor" and "Multi-line (^$)" is checked.

What am I missing here?

Asked By: Snuh

Answers:

To quote the docs:

Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string.
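
In other words, re.match() already behaves as if the pattern started with ^, so adding it changes nothing. A regex tester that scans the whole input behaves more like re.search(), and there the ^ does matter. Here is a minimal sketch of the difference using the pattern and strings from the question (the variable names are just for illustration):

```python
import re

# Pattern and strings taken from the question; variable names are illustrative.
pattern = r'[a-z0-9]+_[0-9]+.[0-9]+.[0-9]+_[0-9]{8}_[0-9]{6}.log'
anchored = '^' + pattern

string2 = '_test_1.0.0_20220728_151206.log'

# re.match() only tries to match at position 0, so the leading '_'
# makes both attempts fail, with or without '^'.
match_plain = re.match(pattern, string2)
match_anchored = re.match(anchored, string2)

# re.search() scans forward through the string, so the unanchored
# pattern finds a match starting at index 1 (just after the '_').
search_plain = re.search(pattern, string2)

# With '^', re.search() may only match at the start of the string
# (re.MULTILINE is not set), so it fails as well.
search_anchored = re.search(anchored, string2)

print(match_plain, match_anchored)   # None None
print(search_plain.span())           # (1, 31)
print(search_anchored)               # None
```

So if you want Python to behave like the tester without the ^, use re.search(); if you only ever want matches at the start of the string, re.match() (or re.search() with ^) is enough.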

Answered By: isaactfa