Regex sub not working as expected in python
Question:
I have the following string:
param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '
I have the following regex, meant to match one or more whitespace characters before and after "as", plus the word that follows:
as_regex = r"\s+as\s+\w+"
I have verified in a regex tester that this matches what I am looking for.
I do the following call:
new_param = re.sub(as_regex, '', param, re.IGNORECASE)
new_param is the same string as before though. It’s driving me crazy. Calling
re.search(as_regex, param, re.IGNORECASE)
returns the string AS ECSCpuUtilization
exactly like I want. re.match does not, but I don’t think that should matter, because re.sub works the same way as re.search, if I’m not mistaken.
What am I overlooking here? Let me know if there’s anything I can clarify.
Answers:
An empty replacement string is valid in re.sub(): it simply deletes each match, so the second argument is not what keeps the string unchanged. The call fails because re.IGNORECASE is not being applied as a flag. Any replacement, empty or not, works once the flag is passed by keyword, for example:
new_param = re.sub(as_regex, ' NEW_STRING ', param, flags=re.IGNORECASE)
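Set the flags with a keyword argument. The signature is re.sub(pattern, repl, string, count=0, flags=0), so the fourth positional argument is count, not flags. re.IGNORECASE has the integer value 2, which means the original call ran with count=2 and no flags. re.search(pattern, string, flags=0) takes flags as its third positional argument, which is why the search call worked. See the re.sub documentation.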
new_param = re.sub(as_regex, '', param, flags=re.IGNORECASE) # or re.I
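To see why the positional form fails silently, here is a small sketch using the string from the question, with the backslashes of the pattern written out. The variable names unchanged and fixed are only for illustration, and the warnings filter is there because recent Python versions may emit a DeprecationWarning when count is passed positionally.

```python
import re
import warnings

param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '
as_regex = r"\s+as\s+\w+"

# The fourth positional argument is count, so this call means count=2
# with no flags set. The match stays case-sensitive and finds nothing.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence a possible DeprecationWarning
    unchanged = re.sub(as_regex, '', param, re.IGNORECASE)

# Passing the flag by keyword makes the match case-insensitive.
fixed = re.sub(as_regex, '', param, flags=re.IGNORECASE)

print(repr(unchanged))
print(repr(fixed))
```

The first result is identical to param, while the second has " AS ECSCpuUtilization" removed and keeps the trailing space.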
Alternatively, you can use (?i) in the regular expression itself to ignore case.
as_regex = r"(?i)\s+as\s+\w+"
new_param = re.sub(as_regex, '', param)
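A related option, not mentioned in the question, is to compile the pattern once with the flag. The sub method of a compiled pattern takes no flags parameter, so the flag cannot be mistaken for count. This is only a sketch; the names as_pattern, inline and compiled are made up for the example.

```python
import re

param = ' average(provider.cpuUtilization.Average) AS ECSCpuUtilization '

# Inline flag: (?i) at the start of the pattern turns on case-insensitivity.
inline = re.sub(r"(?i)\s+as\s+\w+", '', param)

# Compiled pattern: the flag is fixed at compile time, and
# Pattern.sub(repl, string, count=0) has no flags parameter to confuse.
as_pattern = re.compile(r"\s+as\s+\w+", re.IGNORECASE)
compiled = as_pattern.sub('', param)

print(repr(inline))
print(repr(compiled))
```

Both calls strip the " AS ECSCpuUtilization" alias and leave the rest of the string as it was.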