Python regex to remove string which may contain additional character
Question:
I’ve got a string in Python that sometimes starts with either {txt - or {txt.
These do not always appear at the start of the string, but if they do, I want to remove them.
I know I can do it like this:
string = string.strip('{txt -').strip('{txt')
But I’m thinking there is surely a better solution (maybe using regex). Is it possible to add a potential extra character to a regex (in this case -)?
Answers:
You can use re.sub with ( -)? for an optional space and hyphen.
re.sub('^{txt( -)?', '', string)
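Note that strip does not work like you think it does: it treats its argument as a set of characters to remove from both ends of the string, not as a prefix. For instance, "t".strip("{txt") produces an empty string.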
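A short sketch of the difference, using a few made-up sample strings. The helper name strip_txt_prefix is hypothetical, and the brace is escaped here only for clarity; the pattern is otherwise the one given above.

```python
import re

# Hypothetical helper: removes a leading "{txt" or "{txt -" and nothing else.
PREFIX = re.compile(r'^\{txt( -)?')

def strip_txt_prefix(s):
    return PREFIX.sub('', s)

print(repr(strip_txt_prefix('{txt - hello')))  # ' hello'
print(repr(strip_txt_prefix('{txt hello')))    # ' hello'
print(repr(strip_txt_prefix('say {txt')))      # 'say {txt' (not at the start, so unchanged)

# str.strip removes any of the given characters from both ends,
# so it also eats into the text that follows the prefix.
print(repr('{txt - text'.strip('{txt -')))     # 'e'
```

The regex leaves the space that follows the prefix in place; if that should go as well, appending something like ' ?' to the pattern would be one way to handle it.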
Maybe you need a function that keeps only the words and removes every other character from your string. You could try something like this:
import re

def remove_special_chars(text, remove_digits=False):
    # Keep letters and spaces; keep digits too unless remove_digits is True.
    if remove_digits:
        text = re.sub("[^a-zA-Z ]", '', text)
    else:
        text = re.sub("[^a-zA-Z0-9 ]", '', text)
    return text