Is calling str.replace() twice the best solution for overlapping matches?
Question:
When I execute the following code, I expect every ' a ' to be replaced by ' b ', yet only non-overlapping matches are replaced.
" a a a a a a a a ".replace(' a ', ' b ')
>>>' b a b a b a b a '
So I use the following:
" a a a a a a a a ".replace(' a ', ' b ').replace(' a ', ' b ')
>>>' b b b b b b b b '
Is this a bug or a feature of replace?
From the docs, ALL OCCURRENCES are replaced:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
Answers:
Your best bet is most likely a regular expression. Lookbehind and lookahead assertions let you match text only when it is surrounded by a specific pattern, without making that pattern part of the match.
import re
s = " a a a a a a a a "
pattern = r'(?<= )a(?= )'
print(re.sub(pattern, "b", s))
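The lookarounds are zero-width: the spaces are checked but never become part of the match, so they are neither consumed nor replaced, and a single space can serve as the boundary for the 'a' on either side of it.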
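To make the difference concrete, here is a small sketch that lists where each pattern matches in the string from the question. The variable names are only illustrative.

```python
import re

s = " a a a a a a a a "

# Plain pattern: each match consumes both surrounding spaces,
# so the search resumes after the trailing space and skips every other 'a'.
consuming = [m.span() for m in re.finditer(r" a ", s)]

# Lookaround pattern: only the 'a' itself is consumed; the spaces are
# checked but stay available as boundaries for the neighbouring matches.
zero_width = [m.span() for m in re.finditer(r"(?<= )a(?= )", s)]

print(len(consuming), consuming)   # 4 [(0, 3), (4, 7), (8, 11), (12, 15)]
print(len(zero_width))             # 8
print(repr(re.sub(r"(?<= )a(?= )", "b", s)))  # ' b b b b b b b b '
```

This is the same left-to-right, non-overlapping scan that str.replace() performs, which is why a single replace(' a ', ' b ') only changes every other 'a'.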
Why not replace only the thing you actually want to change, which is 'a' rather than ' a '? Like this:
" a a a a a a a a ".replace('a', 'b')
This gives the output:
' b b b b b b b b '
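One caveat: this shortcut matches the other approaches only when every 'a' in the input is a standalone token. A brief sketch, using a made-up sample string that is not from the question, of how the results diverge once 'a' also appears inside words:

```python
import re

text = " a cat a hat a "  # hypothetical input, chosen only for illustration

# Replacing the bare character also rewrites letters inside words.
plain = text.replace("a", "b")

# The lookaround pattern only touches an 'a' with a space on both sides.
delimited = re.sub(r"(?<= )a(?= )", "b", text)

print(repr(plain))      # ' b cbt b hbt b '
print(repr(delimited))  # ' b cat b hat b '
```

If the input is guaranteed to contain only space-separated single letters, as in the question, the plain replace('a', 'b') is the simplest option.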