Checking whole string with a regex

Question:

I’m trying to check whether a string is a number, so the regex “\d+” seemed good. However, that regex also matches “78.46.92.168:8000” for some reason, which I do not want. A little bit of code:

import re

class Foo():
    _rex = re.compile(r"\d+")
    def bar(self, string):
        # match() only anchors at the start, so a leading run of digits is enough
        m = self._rex.match(string)
        if m is not None:
            doStuff()

And doStuff() is called when the IP address is entered. I’m kind of confused: how does “.” or “:” match “\d”?

Asked By: dutt


Answers:

Change it from \d+ to ^\d+$

Answered By: prostynick

\d+ matches a run of one or more digits within your string, so it matches the leading 78 and succeeds.

Use ^\d+$.

Or, even better: "78.46.92.168:8000".isdigit()
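
A minimal sketch of what happens with the string from the question. The variable names are only illustrative and not part of the original code:

```python
import re

# Sample input taken from the question.
address = "78.46.92.168:8000"

# match() only requires the pattern to match at the start of the string,
# so the leading "78" is enough for an unanchored \d+ to succeed.
prefix = re.match(r"\d+", address)
matched_text = prefix.group()

# With both anchors the whole string has to consist of digits.
anchored_address = re.match(r"^\d+$", address)   # None
anchored_number = re.match(r"^\d+$", "8000")     # a match object
```

Inspecting matched_text shows that only the first two characters were consumed, which is why the "." and ":" never had to match \d.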

Answered By: eumiro

re.match() always matches from the start of the string (unlike re.search()) but allows the match to end before the end of the string.

Therefore, you need an anchor: compiling the pattern as re.compile(r"\d+$") and then calling _rex.match(string) would work.

To be more explicit, you could also compile r"^\d+$" (the ^ is redundant with match()), or drop match() altogether and call _rex.search(string) with the pattern r"^\d+$".
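
To illustrate the difference between the two methods, here is a short sketch. The sample text and variable names are made up for the example:

```python
import re

# Illustrative input: the digits are not at the start of the string.
text = "port 8000"

unanchored = re.compile(r"\d+")
# match() starts at position 0, where "p" is not a digit, so it fails.
match_result = unanchored.match(text)
# search() scans forward and finds the digits later in the string.
search_result = unanchored.search(text)

anchored = re.compile(r"^\d+$")
# With both anchors in the pattern, match() and search() give the same answer.
both_reject = anchored.match(text) is None and anchored.search(text) is None
both_accept = bool(anchored.match("8000")) and bool(anchored.search("8000"))
```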

Answered By: Tim Pietzcker

\Z matches only at the end of the string, while $ matches at the end of the string or just before a newline at the end of the string, and behaves differently under re.MULTILINE. See the syntax documentation for detailed information.

>>> s="1234\n"
>>> re.search(r"^\d+\Z",s)
>>> s="1234"
>>> re.search(r"^\d+\Z",s)
<_sre.SRE_Match object at 0xb762ed40>
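
For comparison, a small sketch of how $ and \Z treat the same input with a trailing newline (variable names are illustrative only):

```python
import re

# Input with a trailing newline, as in the session above.
with_newline = "1234\n"

# $ is allowed to match just before the final newline, so this succeeds.
dollar_result = re.search(r"^\d+$", with_newline)

# \Z only matches at the very end of the string, so the newline blocks it.
z_result = re.search(r"^\d+\Z", with_newline)
```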
Answered By: ghostdog74

There are a couple of options in Python to match an entire input with a regex.

Python 2 and 3

In Python 2 and 3, you may use

re.match(r'\d+$', s) # re.match anchors the match at the start of the string, so $ is all that remains to add

or, to avoid matching before a final \n in the string:

re.match(r'\d+\Z', s) # \Z only matches at the very end of the string

Or do the same with re.search, which does not anchor the match at the start of the string and therefore needs a ^ / \A start-of-string anchor:

re.search(r'^\d+$', s)
re.search(r'\A\d+\Z', s)

Note that \A is an unambiguous start-of-string anchor; its behavior cannot be redefined by any modifier (re.M / re.MULTILINE only redefines the behavior of ^ and $).
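
A short sketch of that point, using a made-up two-line input:

```python
import re

# Illustrative input: the digits sit on the second line.
text = "abc\n1234"

# Under re.MULTILINE, ^ and $ match at line boundaries, so the second
# line on its own satisfies the pattern.
caret_dollar = re.search(r"^\d+$", text, re.MULTILINE)

# \A and \Z still refer to the start and end of the whole string,
# so the same flag does not change the result here.
strict = re.search(r"\A\d+\Z", text, re.MULTILINE)
```

If the goal is to validate the entire input, the \A ... \Z form keeps working even when someone later adds re.MULTILINE to the call.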

Python 3

All the options described in the section above still work, plus one more useful method, re.fullmatch (added in Python 3.4; also present in the PyPI regex module):

If the whole string matches the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.

So, after you compile the regex, just use the appropriate method:

_rex = re.compile(r"\d+")
if _rex.fullmatch(s):
    doStuff()
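
As a rough sketch, this could be wrapped in a small helper. The name is_number is hypothetical and only used for illustration:

```python
import re

_rex = re.compile(r"\d+")

def is_number(s):
    """Hypothetical helper: True only if the whole string is digits."""
    # fullmatch() requires the pattern to cover the entire string,
    # so no ^ / $ / \A / \Z anchors are needed in the pattern itself.
    return _rex.fullmatch(s) is not None

results = {
    "8000": is_number("8000"),
    "78.46.92.168:8000": is_number("78.46.92.168:8000"),
    "1234\n": is_number("1234\n"),
    "": is_number(""),
}
```

Only the first entry comes out as True; the trailing newline and the empty string are both rejected.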
Answered By: Wiktor Stribiżew