Python regex doesn't match when adding additional text around pattern and text

Question:

So I’m trying to match "Python 3.11.4 (64-bit) Setup" like so:

re.match(r"Python (\d.)+\d (64-bit) Setup", "Python 3.11.4 (64-bit) Setup")

However, for some reason, it doesn’t work. But, when I try

re.match(r"(\d.)+\d", "3.11.4")

it matches perfectly well. How do I fix this?

P.S.:
My end goal is actually to match it with this pattern:

^Python (\d.)+\d( (\d+-bit))? Setup$

Hi, so I see that this question has been marked as a duplicate of "What special characters must be escaped in regular expressions?". However, in my opinion, this is invalid. Simply because the problem is related to special characters does not mean that the questions are "duplicates". Although the answers may be similar, it stands to reason that the two questions are unique. I am asking about a specific problem in my regex. However, the other question is solely inquiring about special characters of regexes.

Going to the provided question "duplicate" would not help me at all, at the time, in determining any issue in my regex, when I had no idea what the problem was. I believe it is unfair to state answers from another question as evidence for a "duplicate" after the answer is well known when in the past, I was clueless on what to even search up to attempt to find answers to my question.

From my perspective, the only logical course of action was to post a completely new question on StackOverflow in order to find answers. I couldn’t have known that there had been a "similar" question beforehand.

As quoted in this answer referencing a StackOverflow blog post,

There could be hundreds of different, related, perfectly valid questions on the same topic. There is no One True Question.

It’s rarely this straightforward, however — usually there are two similar but not-quite-the-same questions, both of which have value for different reasons.

My question falls under this group of questions: two similar but not identical questions.

I do believe this question provides value to the overall StackOverflow community by providing answers to potential questions programmers may have.

Please point me to any additional links that explain how to determine whether a question is a "duplicate" or not. Please refrain from disliking unless you provide an explanation. Any likes undoing the current negative score situation would be well appreciated. Many thanks to the one who liked my post and turned it from a -2 to a -1. I would not like to lose my reputation over a trivial matter.

Asked By: CrazyVideoGamer


Answers:

Two mistakes:

  • the literal parentheses (and the dot) are not escaped
  • (\d.) matches exactly one digit plus one more character, so it doesn’t match e.g. 11.

Working version:

r"Python (\d+\.)+\d \(64-bit\) Setup"
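
A quick way to check both points is to run the original and the escaped pattern against the title from the question. This is only a minimal sketch; the variable names are chosen for illustration.

```python
import re

title = "Python 3.11.4 (64-bit) Setup"

# Pattern from the question: ( and ) open and close groups instead of
# matching literal parentheses, and \d. consumes exactly two characters.
broken = re.match(r"Python (\d.)+\d (64-bit) Setup", title)

# Escaped version: \( and \) match literal parentheses, \. matches a
# literal dot, and \d+ allows multi-digit components such as 11.
fixed = re.match(r"Python (\d+\.)+\d \(64-bit\) Setup", title)

print(broken)          # None
print(fixed.group())   # Python 3.11.4 (64-bit) Setup
```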
Answered By: gog

"… However, for some reason, it doesn’t work. But, when I try"

re.match(r"(\d.)+\d", "3.11.4")

"it matches perfectly well. How do I fix this? …"

It didn’t match the entire string. re.match only requires the pattern to match at the start of the string, so it returned a partial match.

import re

match = re.match(r'(\d.)+\d', '3.11.4')
print(match.group())

Output

3.1
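
To make the difference between a partial match and a full match visible, one option is to compare re.match with re.fullmatch. This is a small sketch for illustration, not part of the original pattern.

```python
import re

pattern = r"(\d.)+\d"

# re.match only requires the pattern to match at the start of the string.
prefix = re.match(pattern, "3.11.4")
print(prefix.group())   # 3.1

# re.fullmatch requires the whole string to match the pattern.
whole = re.fullmatch(pattern, "3.11.4")
print(whole)            # None
```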

The \d syntax will match a single digit, 0 through 9.

To match a sequence of one or more digits, append the + quantifier: \d+.
Additionally, it’s good practice to escape the dot character (\.), because an unescaped dot matches any character.

match = re.match(r'(\d+\.)+\d+', '3.11.4')
print(match.group())

Output

3.11.4
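
The effect of escaping the dot can be checked with a string that uses other separators. The input "3x11y4" below is a made-up test value, used only to show the difference.

```python
import re

unescaped = r"(\d+.)+\d+"   # . matches any character except a newline
escaped = r"(\d+\.)+\d+"    # \. matches only a literal dot

print(bool(re.fullmatch(unescaped, "3x11y4")))  # True
print(bool(re.fullmatch(escaped, "3x11y4")))    # False
print(bool(re.fullmatch(escaped, "3.11.4")))    # True
```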

So you’ll have to refactor your full pattern to account for this, and escape the literal parentheses, since you are using ^ and $ to anchor the match to the start and end of the string.

^Python (\d+\.)+\d+( \(\d+-bit\))? Setup$

Here is an example.

import re

string = 'Python 3.11.4 (64-bit) Setup'
match = re.match(r'^Python (\d+\.)+\d+( \(\d+-bit\))? Setup$', string)
print(match.group(1))
print(match.group(2))

Output

11.
 (64-bit)
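
Note that group 1 in the output above is 11. and not the whole version: a repeated capturing group keeps only its last repetition. If the full version number is wanted, one possible approach is to wrap the repetition in a single outer group. The sketch below uses named groups; the names version and bits are assumptions chosen for illustration.

```python
import re

pattern = re.compile(
    r"^Python (?P<version>(?:\d+\.)+\d+)"   # whole version, e.g. 3.11.4
    r"(?: \((?P<bits>\d+)-bit\))? Setup$"    # optional " (64-bit)" part
)

for title in ("Python 3.11.4 (64-bit) Setup", "Python 3.11.4 Setup"):
    m = pattern.match(title)
    print(m.group("version"), m.group("bits"))
```

The inner groups are non-capturing, written (?:...), so only the two named groups appear in the result. When the optional part is absent, m.group("bits") is None.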
Answered By: Reilas