Regex: capture specific string with conditions

Question:

I'm trying to capture just the following string: u00, because I need to replace it with \u00.

Sometimes these characters appear with a \ before them; in that case, I don’t want to capture them.
At other times the preceding symbol is ", and then I want to capture it, but just the u00, not "u00.

I'm trying this:

file_modified = re.sub(r'[^\\|^\s](u00)', r'\\u00', original_file)

I'm capturing the " as well and I don’t know how to skip it; I just want to capture u00.

Asked By: jabeono


Answers:

Use a negative lookbehind assertion: r'(?<!\\)(u00)'. This will match u00 provided it is not preceded by \.
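As a minimal sketch of how this lookbehind could be plugged into re.sub (the sample string is made up for illustration, and the replacement r'\\u00' is assumed from what the question asks for):

```python
import re

# Hypothetical sample input: a bare u00, one after a quote, and one already escaped.
original_file = r'u00 "u00 \u00'

# (?<!\\) only checks that the previous character is not a backslash;
# it does not consume that character, so the quote is left untouched.
file_modified = re.sub(r'(?<!\\)u00', r'\\u00', original_file)

print(file_modified)
# => \u00 "\u00 \u00
```

Because the lookbehind is zero-width, only u00 itself is part of the match, so nothing before it needs to be put back in the replacement.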

Answered By: craigb

Just match the backslash optionally:

file_modified = re.sub(r'\\?u00', r'\\u00', original_file)

Here,

  • \\?u00 – matches an optional \ followed by u00
  • \\u00 – is a replacement pattern that replaces the match with \u00

Thus, even if there was a \ before u00, it won’t disappear and won’t get doubled, but if it was missing, it will be added.

See the Python demo:

import re
original_file = r"u00 because i need to replace it to \u00"
print(re.sub(r'\\?u00', r'\\u00', original_file))
# => \u00 because i need to replace it to \u00
Answered By: Wiktor Stribiżew