Python regex to remove string which may contain additional character

Question:

I’ve got a string in Python that sometimes starts with either {txt - or {txt.

These do not always appear at the start of the string, but if they do, I want to remove them.

I know I can do it like this:

string = string.strip('{txt -').strip('{txt')

But I’m thinking there is surely a better solution (maybe using regex). Is it possible to add a potential extra character to a regex (in this case -)?

Asked By: nimgwfc


Answers:

You can use re.sub with ( -)? for an optional space and hyphen.

re.sub('^{txt( -)?', '', string)

Note that strip does not work like you think it does. For instance, "t".strip("{txt") produces an empty string.
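To make the difference concrete, here is a minimal sketch comparing the two approaches. The helper name drop_prefix and the sample strings are made up for illustration. The pattern is the same one as above, only written as a raw string with the brace escaped and a non-capturing group, which should behave identically.

```python
import re

# Hypothetical helper: removes "{txt" (optionally followed by " -")
# only when it appears at the very start of the string.
PREFIX = re.compile(r"^\{txt(?: -)?")

def drop_prefix(s):
    return PREFIX.sub("", s)

print(repr(drop_prefix("{txt - hello")))  # ' hello' (the space after the hyphen stays)
print(repr(drop_prefix("{txt hello")))    # ' hello'
print(repr(drop_prefix("hello {txt")))    # 'hello {txt' (not at the start, so untouched)

# strip() treats its argument as a set of characters to trim from both
# ends, not as a prefix, so it also eats legitimate content.
print(repr("text".strip("{txt")))         # 'e'
print(repr(drop_prefix("text")))          # 'text'
```

If the space that follows the prefix should go as well, one option is to append ` ?` (an optional space) to the pattern, or to call lstrip() on the result.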

Answered By: Unmitigated

Alternatively, you may want a function that keeps only the words and removes every other character from the string. You could try something like this:

import re

def remove_special_chars(text, remove_digits=False):
    # Keep letters and spaces; digits are kept too unless remove_digits is True.
    if remove_digits:
        text = re.sub("[^a-zA-Z ]", '', text)
    else:
        text = re.sub("[^a-zA-Z0-9 ]", '', text)
    return text

Answered By: fCremer