Regex sub not working as expected in Python

Question:

I have the following string:

param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '

I have the following regex, which should match one or more whitespace characters, the word as, one or more whitespace characters, and then the word that follows:

as_regex = r"\s+as\s+\w+"

I have verified in a regex tester that this matches what I am looking for.

I do the following call:

new_param = re.sub(as_regex, '', param, re.IGNORECASE)

new_param is the same string as before though. It’s driving me crazy. Calling

re.search(as_regex, param, re.IGNORECASE)

finds the string AS ECSCpuUtilization, exactly like I want. re.match does not, but I don’t think that should matter, because re.sub works the same as re.search if I’m not mistaken.

What am I overlooking here? Let me know if there’s anything I can clarify.

Asked By: user3736114

Answers:

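The second argument of re.sub() is the replacement string. An empty string is valid there and simply deletes each match, so it is not what leaves the string unchanged. The fourth positional argument is count, so re.IGNORECASE is being read as a count instead of a flag and the match stays case-sensitive.

If you want to substitute other text instead of deleting the match, pass the desired replacement string and give the flag by keyword, for example:

new_param = re.sub(as_regex, ' NEW_STRING ', param, flags=re.IGNORECASE)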
Answered By: Apex

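Set the flags with a keyword argument. The signature is re.sub(pattern, repl, string, count=0, flags=0), so the fourth positional argument is count and flags is the fifth. Your call passes re.IGNORECASE (integer value 2) as count, so the pattern is still matched case-sensitively and never finds the uppercase AS. See the re.sub documentation.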

new_param = re.sub(as_regex, '', param, flags=re.IGNORECASE) # or re.I
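A minimal sketch of the difference, using the string and pattern from the question. The variable names positional and keyword are only illustrative:

```python
import re

param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '
as_regex = r"\s+as\s+\w+"

# Signature: re.sub(pattern, repl, string, count=0, flags=0)
# re.IGNORECASE lands in the count slot here, so no flag is set and the
# lowercase "as" in the pattern never matches the uppercase "AS".
positional = re.sub(as_regex, '', param, re.IGNORECASE)

# Passing the flag by keyword makes the match case-insensitive.
keyword = re.sub(as_regex, '', param, flags=re.IGNORECASE)

print(int(re.IGNORECASE))   # 2
print(positional == param)  # True
print(repr(keyword))        # ' average(provider.cpuUtilization.Average) '
```

Nothing raises an error in the positional call because count=2 is a perfectly valid argument, which is why the mistake is easy to miss.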

Alternatively, you can use (?i) in the regular expression itself to ignore case.

as_regex = r"(?i)\s+as\s+\w+"
new_param = re.sub(as_regex, '', param)
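
As a quick check, the inline flag and the keyword flag should give the same result on the string from the question. The compiled-pattern variant at the end is an extra option not mentioned in the answers; it is shown only as a sketch, and the names inline, by_keyword, as_pattern and compiled are illustrative:

```python
import re

param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '

# Inline flag: (?i) at the start of the pattern turns on case-insensitivity.
inline = re.sub(r"(?i)\s+as\s+\w+", '', param)

# Same pattern without the inline flag, with the flag passed by keyword.
by_keyword = re.sub(r"\s+as\s+\w+", '', param, flags=re.IGNORECASE)

# Alternative sketch: compile the pattern with its flags once.
# Pattern.sub(repl, string, count=0) takes no flags argument, so a flag
# cannot be passed in the count position by accident.
as_pattern = re.compile(r"\s+as\s+\w+", re.IGNORECASE)
compiled = as_pattern.sub('', param)

print(inline == by_keyword == compiled)  # True
print(repr(inline))  # ' average(provider.cpuUtilization.Average) '
```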
Answered By: Unmitigated