Why doesn't the Python regular expression asterisk * match the last character?

Question:

I have just started learning regex and cannot understand the following.

import re

s="abc"

print(re.search("c*",s))

The result is a match object that doesn’t match anything:
<re.Match object; span=(0, 0), match=''>

I know that if I add $ it’ll work, like c*$. But * should be greedy and match the last character even without $.

Can someone explain to me why it doesn’t?

Asked By: qiu


Answers:

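You seem to think that c* means "match a c, and then any character", but that would be a wildmatch (glob) pattern, not a regex pattern. In regex, the * is a quantifier on the c, meaning "match 0 or more c". Greediness only decides how many c's are consumed at a given starting position; the search still takes the leftmost position where the pattern can succeed, and zero c's is a valid match at index 0. So the first match in "abc" that you’re getting is actually a zero-width match at the start of the string: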

 abc
^
|
|

That is what the span=(0, 0) shown in the match repr is trying to indicate.
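To make this concrete, here is a minimal sketch using only the standard re module. It inspects the zero-width match directly and compares it with two patterns that cannot succeed at index 0: c+ (at least one c) and the c*$ form mentioned in the question. The variable names are only illustrative.

```python
import re

s = "abc"

# c* succeeds at index 0 by matching zero c's, so the search stops there.
m = re.search("c*", s)
zero_width = (m.span(), m.group())

# c+ needs at least one c, so it cannot match before index 2.
plus = re.search("c+", s)
plus_result = (plus.span(), plus.group())

# c*$ must finish at the end of the string; index 2 is the first start where that works.
anchored = re.search("c*$", s)
anchored_result = (anchored.span(), anchored.group())

print(zero_width)       # ((0, 0), '')
print(plus_result)      # ((2, 3), 'c')
print(anchored_result)  # ((2, 3), 'c')
```

If the goal is simply "find the run of c's", c+ is usually the more direct choice, since it rules out empty matches altogether.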

There are 4 matches in this string in total (the zero-width one at the start plus 3 others):

>>> re.findall("c*", "abc")
['', '', 'c', '']

However, re.search will only return the first one (if any).
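As a rough illustration of where each of those matches sits, re.finditer yields match objects instead of strings, so the spans can be listed next to the matched text. This is only a sketch, assuming Python 3.7 or later; the names all_matches and first are made up for the example.

```python
import re

s = "abc"

# Collect (span, text) for every non-overlapping match of c* in s.
all_matches = [(m.span(), m.group()) for m in re.finditer("c*", s)]

# re.search reports only the first of these.
first = re.search("c*", s)

for span, text in all_matches:
    print(span, repr(text))
# (0, 0) ''
# (1, 1) ''
# (2, 3) 'c'
# (3, 3) ''
```

The only non-empty match is the one starting at index 2; the other three are empty matches before a, before b, and at the end of the string.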

Answered By: wim