python regex first/shortest match
Question:
I'm trying to write a regex that finds all occurrences of the pattern
p = "q=http://.*.doc" in
text = "q=http://11111.doc,q=http://22222.doc"
When I call findall on text,
I get the whole thing back as a single match, i.e. q=http://11111.doc,q=http://22222.doc
rather than the two separate matches q=http://11111.doc
and q=http://22222.doc
How do I fix it?
Answers:
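That’s because * is a greedy quantifier: it tries to match as much as it can. Make it non-greedy by writing *? instead:
q=http://.*?.doc
Note that the unescaped . before doc matches any character; write \. if you want to match only a literal dot.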
More information can be found in the Regular Expression HOWTO:
Greedy versus Non-greedy
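A minimal sketch of the difference, using the text from the question. The variable names greedy and lazy are only illustrative, and the dot before doc is escaped here so that it matches a literal dot, which the pattern in the question does not do.

```python
import re

text = "q=http://11111.doc,q=http://22222.doc"

# Greedy: .* first consumes the rest of the string, then backtracks only
# as far as the last ".doc", so the whole text comes back as one match.
greedy = re.findall(r"q=http://.*\.doc", text)

# Non-greedy: .*? expands one character at a time and stops at the first
# ".doc" it reaches, so each URL is returned as its own match.
lazy = re.findall(r"q=http://.*?\.doc", text)

print(greedy)  # ['q=http://11111.doc,q=http://22222.doc']
print(lazy)    # ['q=http://11111.doc', 'q=http://22222.doc']
```

If the URLs can never contain a comma, a negated character class such as [^,]* in place of .*? should give the same two matches without relying on non-greedy matching.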