How to match my string? Do I need negative lookbehind?

Question:

Let’s assume we have the following string:

This thing costs $5000.

I’m trying to match up $5000 with negative lookbehind:

(?<!([:;]))$?([0-9]+)

So that it doesn’t find a match if it has ";" or ‘:’ behind $5000, eg. ;$5000 or ;5000.

First string:

This thing costs $5000
or
5000

Desired output:

$5000
or
5000

Second string:

This thing costs ;$5000
or
;5000

Desired output:

None

Asked By: Dirael

||

Answers:

Sometimes, matching what you do want rather than what you don’t want is easier. As far as I can tell, what you’re really looking for is an integer that optionally start with $. But, you still want to capture the $ if it’s there. The ; and : are just red-herrings.

import re

values = ['This thing costs $5000.',  # $5000
          'This thing costs ;$5000.', # None
          'This thing costs 5000.',   # 5000
          'This thing costs ;5000.',  # None
          'This thing costs8000']     # None

pattern = r'.*s($?d+)'

for value in values:
    # If a match is made, we want group 1 from that match.
    if match := re.match(pattern, value):
        print(match.group(1))
    else:
        print(match)

Output:

$5000
None
5000
None
None

See https://regexr.com/6sqbs for an in-depth explanation of my pattern .*s($?d+)

Answered By: BeRT2me

Your implementation is great but there’s a single flaw: you can match from the middle of the digits or from the $ sign.

Add d and $ to the negative lookback and it’ll work:

(?<![;:d$])$?([0-9]+)

Examples:

>>> re.findall("(?<![;:d$])$?([0-9]+)", "This thing costs ;$5000")
[]
>>> re.findall("(?<![;:d$])$?([0-9]+)", "This thing costs $5000")
['5000']

Keep in mind I do suggest matching the number instead of dealing with negative lookbacks like so:

re.findall(r"s$?([0-9]+)", "This thing costs ;$5000")
Answered By: Bharel
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.