Regex: capture specific string with conditions
Question:
I'm trying to capture just the following string: u00
because I need to replace it with \u00.
Sometimes these characters appear with a \ before them; in that case, I don't want to capture them.
At other times, the preceding symbol is a "; then I do want to capture it, but just the u00, not "u00.
I'm trying this:
file_modified = re.sub(r'[^\\|^\s](u00)', r'\\u00', original_file)
This also captures the " and I don't know how to skip it; I just want to capture u00.
Answers:
Use a negative lookbehind assertion: r'(?<!\\)(u00)'. This will match u00 provided it is not preceded by \.
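As a minimal sketch of how the lookbehind could be used with re.sub (the sample string and the names sample and fixed are made up for illustration, not taken from the question's file):

```python
import re

# Hypothetical sample: one bare u00, one after a quote, one already escaped.
sample = r'u00e9 "u00e9 \u00e9'

# (?<!\\) asserts that the previous character is not a backslash.
# It does not consume that character, so the quote stays in place.
# In the replacement, \\ stands for one literal backslash.
fixed = re.sub(r'(?<!\\)u00', r'\\u00', sample)
print(fixed)
# => \u00e9 "\u00e9 \u00e9
```

Because the lookbehind is zero-width, only u00 itself is replaced, which is why the " survives here.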
Just match the backslash optionally:
file_modified = re.sub(r'\\?u00', r'\\u00', original_file)
Here,
\\?u00 - matches an optional \ and u00
\\u00 - is a replacement pattern that replaces the match with \u00
Thus, even if there was a \ before u00, it won't disappear and won't get doubled, but if it was missing, it will be added.
See the Python demo:
import re
original_file = r"u00 because i need to replace it to \u00"
print(re.sub(r'\\?u00', r'\\u00', original_file))
# => \u00 because i need to replace it to \u00
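To see why the pattern from the question loses the " while the optional-backslash pattern does not, here is a small comparison. The sample string and the names consumed and kept are hypothetical, chosen only to cover the quote case and the already-escaped case:

```python
import re

# Hypothetical input: u00 right after a quote, then an already-escaped \u00.
sample = r'x"u00e9 \u00e9'

# The negated character class consumes the character before u00,
# so the quote becomes part of the match and is replaced along with it.
consumed = re.sub(r'[^\\|^\s](u00)', r'\\u00', sample)
print(consumed)
# => x\u00e9 \u00e9

# The optional backslash can only ever consume a backslash,
# and the replacement writes that backslash back.
kept = re.sub(r'\\?u00', r'\\u00', sample)
print(kept)
# => x"\u00e9 \u00e9
```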