Why do these two regular expressions work differently with re.sub(), but return the same match with re.search()?
Question:
Both regexes fetch the same match, so why does ",([\d-]+)" work as expected while "(,\d*-?\d*-?\d*)" does not? I was expecting both regexes to give me the same output with re.sub(). What am I missing?
>>> print(re.search(r",([\d-]+)", "Sabrina Green,802-867-5309,System Administrator"))
<re.Match object; span=(13, 26), match=',802-867-5309'>
>>> print(re.search(r"(,\d*-?\d*-?\d*)", "Sabrina Green,802-867-5309,System Administrator"))
<re.Match object; span=(13, 26), match=',802-867-5309'>
>>>
>>> print(re.sub(r",([\d-]+)", r",+1-\1", "Sabrina Green,802-867-5309,System Administrator"))
Sabrina Green,+1-802-867-5309,System Administrator
>>> print(re.sub(r"(,\d*-?\d*-?\d*)", r",+1-\1", "Sabrina Green,802-867-5309,System Administrator"))
Sabrina Green,+1-,802-867-5309,+1-,System Administrator
>>>
Expected output: Sabrina Green,+1-802-867-5309,System Administrator
Answers:
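There are two problems in your second regex.
First, your comma is inside the parentheses, so it is part of group 1 and \1 puts it back after the prefix, thus ,+1-,802-867-5309
Second, you should replace the * with + in your regex. With *, every run of digits is optional, so the pattern also matches the bare comma before System. re.search() only reports the first match, but re.sub() replaces every match. Notice the 5309,+1-,System Administrator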
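Both points can be checked directly. The sketch below is only an illustration: it uses re.finditer() to list every match of the second regex, then applies one possible corrected pattern. The variable names (line, broken, fixed) are made up for this example.

```python
import re

line = "Sabrina Green,802-867-5309,System Administrator"
broken = r"(,\d*-?\d*-?\d*)"

# re.search() stops at the first match; re.sub() replaces every match.
# finditer() lists them all, including the bare "," before "System".
matches = [(m.span(), m.group(1)) for m in re.finditer(broken, line)]
print(matches)
# [((13, 26), ',802-867-5309'), ((26, 27), ',')]

# One possible fix: keep the comma outside the group and require digits with +.
fixed = r",(\d+-?\d+-?\d+)"
result = re.sub(fixed, r",+1-\1", line)
print(result)
# Sabrina Green,+1-802-867-5309,System Administrator
```

The first regex from the question, ",([\d-]+)", avoids both problems for the same reasons: the comma sits outside the group, and + requires at least one digit or hyphen after it.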
In the future, if you're having trouble with a regex, you can check out this site. It will break down the regex and give you a visual representation of what your regex is doing.