Need help to understand the star quantifier (*) output
Question:
I understand the output of the code below:
import re
text = "streets2345"
pattern = r"\d+"
match = re.search(pattern, text)
print(match.group(0))
Output: 2345
However, I do not understand why the code below returns null (an empty string).
import re
text = "streets2345"
pattern = r"\d*"
match = re.search(pattern, text)
print(match.group(0))
Output: null
Here, the first character s of the text matches the pattern \d*. So why is the output not s instead of null?
Answers:
\d* matches zero or more digits. ‘s’ is not a digit, but the pattern still matches at the position before the ‘s’, because there are zero digits there. So the first match is null (an empty string), and that is what re.search returns. In fact, if you scan the whole string (for example with re.findall), the first 7 matches are empty for the same reason, the last of them being at the position before the final ‘s’ in "streets". The 8th match (index 7) is "2345".
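To see where those matches sit, here is a small sketch using re.finditer. That function is my choice for illustration (the question itself only uses re.search), and the sketch assumes Python 3.7 or later.

```python
import re

text = "streets2345"

# Collect (start, end, matched text) for every match of \d* in the string.
spans = [(m.start(), m.end(), m.group(0)) for m in re.finditer(r"\d*", text)]

# Print the first eight matches: seven empty ones at positions 0 to 6
# (one before each letter of "streets"), then "2345" spanning 7 to 11.
for start, end, matched in spans[:8]:
    print(start, end, repr(matched))
```

re.search simply returns the first of these, which is the empty match at position 0.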
\d+ matches one or more digits. Since there is no digit before the first ‘s’ (again, there are zero digits there), it cannot match at that position, so the search moves on until it reaches "2345".
If \d* didn’t match the empty zero-digit positions before each letter, what would be the difference between \d* and \d+?
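As a quick side-by-side check, this sketch runs re.search with both quantifiers on the same text and prints what was matched and where. The variable names star and plus are only illustrative.

```python
import re

text = "streets2345"

star = re.search(r"\d*", text)  # zero or more digits
plus = re.search(r"\d+", text)  # one or more digits

# \d* succeeds immediately with an empty match at the start of the string.
print(repr(star.group(0)), star.span())  # '' (0, 0)

# \d+ has to skip past the letters until it finds at least one digit.
print(repr(plus.group(0)), plus.span())  # '2345' (7, 11)
```

If the goal is to pull out the digits, \d+ is the pattern to use with re.search.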