Regex Negative Lookbehind works in PCRE but not in Python

Question:

The pattern (?<!(asp|php|jsp))\?.* works in PCRE, but it doesn’t work in Python.

So what can I do to get this regex working in Python? (Python 2.7)

Asked By: Matt Elson


Answers:

It works perfectly fine for me. Are you maybe using it wrong? Make sure to use re.search instead of re.match, because re.match only tries to match at the very start of the string:

>>> import re
>>> s = 'somestring.asp?1=123'
>>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
>>> s = 'somestring.xml?1=123'
>>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
<_sre.SRE_Match object at 0x0000000002DCB098>

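This is exactly how your pattern should behave: in the first string the ? is directly preceded by asp, so the lookbehind rejects it and re.search returns None; in the second string it is preceded by xml, so the pattern matches. As glglgl mentioned, you can get the matched text by assigning that Match object to a variable (say m) and then calling m.group(). That yields ?1=123.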
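To make the m.group() step concrete, here is a minimal sketch. The helper name query_part and the constant PATTERN are made up for illustration and are not from the question; the pattern is written with the literal question mark escaped as \?.

```python
import re

# Pattern from the question, with the literal '?' escaped.
PATTERN = re.compile(r"(?<!(asp|php|jsp))\?.*")

def query_part(url):
    """Hypothetical helper: return the '?...' tail of url, or None
    when the '?' is directly preceded by asp, php or jsp."""
    m = PATTERN.search(url)  # search scans the whole string
    return m.group() if m else None

print(query_part('somestring.xml?1=123'))  # ?1=123
print(query_part('somestring.asp?1=123'))  # None

# re.match only tries position 0, where there is no '?',
# so it returns None even for the .xml string.
print(PATTERN.match('somestring.xml?1=123'))  # None
```

The last line is the likely reason the pattern seemed not to work: with re.match the string would have to start with the question mark.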

By the way, you can leave out the inner parentheses. This pattern is equivalent:

(?<!asp|php|jsp)\?.*
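If you want to convince yourself that the two spellings behave the same, a quick check along these lines should do. The sample strings and the helper matched_text are invented for illustration; both patterns are written with the question mark escaped as \?.

```python
import re

with_group = re.compile(r"(?<!(asp|php|jsp))\?.*")
without_group = re.compile(r"(?<!asp|php|jsp)\?.*")

# Illustrative inputs, not taken from the question.
samples = [
    'somestring.asp?1=123',
    'somestring.php?x=1',
    'somestring.jsp?',
    'somestring.xml?1=123',
    'no-query-here',
    'a?b.asp?c',
]

def matched_text(pattern, s):
    """Return the matched substring, or None if there is no match."""
    m = pattern.search(s)
    return m.group() if m else None

for s in samples:
    # Both patterns should agree on every sample.
    assert matched_text(with_group, s) == matched_text(without_group, s)

# The only difference: the inner parentheses add a capturing group.
print(with_group.groups, without_group.groups)  # 1 0
```

Both versions are accepted by Python's re module because every alternative inside the lookbehind has the same length (three characters).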
Answered By: Martin Ender