Python — Regex match pattern OR end of string

Question:

import re
re.findall(r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <$])", "+1.222.222.2222<")

The above code works fine if my string ends with a "<" or space. But if it’s the end of the string, it doesn’t work. How do I get +1.222.222.2222 to return in this condition:

import re
re.findall(r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <$])", "+1.222.222.2222")

*I removed the "<" and just terminated the string. It returns an empty list in this case, but I’d like it to return the full number: +1.222.222.2222

POSSIBLE ANSWER:

import re
re.findall(r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <]|$)", "+1.222.222.2222")
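
A small check of why this version behaves differently. This is only a sketch, and the names NUMBER, IN_CLASS and ANCHORED are made up for illustration. Inside square brackets, $ is an ordinary dollar sign, so [ <$] always needs one more character after the number, while (?:[ <]|$) also accepts the end of the string:

```python
import re

# The phone-number part from the question, kept as one capturing group.
NUMBER = r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})"

IN_CLASS = NUMBER + r"(?:[ <$])"    # "$" inside [...] is a literal dollar sign
ANCHORED = NUMBER + r"(?:[ <]|$)"   # "$" outside [...] anchors at the end of the string

in_class_bare = re.findall(IN_CLASS, "+1.222.222.2222")      # nothing follows the number
in_class_dollar = re.findall(IN_CLASS, "+1.222.222.2222$")   # a literal "$" follows it
anchored_bare = re.findall(ANCHORED, "+1.222.222.2222")
anchored_angle = re.findall(ANCHORED, "+1.222.222.2222<")

print(in_class_bare)     # []
print(in_class_dollar)   # ['+1.222.222.2222']
print(anchored_bare)     # ['+1.222.222.2222']
print(anchored_angle)    # ['+1.222.222.2222']
```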
Asked By: Gary


Answers:

I think you’ve solved the end-of-string issue, but there are a couple of other potential issues with the pattern in your question:

  • the - in [ -.] either needs to be escaped as \- or placed in the first or last position within square brackets, e.g. [-. ] or [ .-]; if you search for [] in the Python re docs you’ll find the relevant info:
Ranges of characters can be indicated by giving two characters and separating them 
by a '-', for example [a-z] will match any lowercase ASCII letter, [0-5][0-9] will match
all the two-digits numbers from 00 to 59, and [0-9A-Fa-f] will match any hexadecimal
digit. If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character
(e.g. [-a] or [a-]), it will match a literal '-'.
  • you may want to require that either matching parentheses or none are present around the first 3 of 10 digits using (?:\(\d{3}\) ?|\d{3}[-. ]?)
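
To make the first point concrete, here is a rough sketch of what [ -.] accepts compared with [-. ]. The variable names are just for illustration. Since the range runs from the space character (0x20) to the period (0x2E), it also lets through characters such as *, # and the comma:

```python
import re

loose = re.compile(r"[ -.]")    # a range: every character from " " (0x20) to "." (0x2E)
strict = re.compile(r"[-. ]")   # only a literal "-", "." or " "

# Every ASCII character from 0x20 (" ") through 0x2F ("/").
candidates = [chr(code) for code in range(0x20, 0x30)]

loose_hits = "".join(c for c in candidates if loose.fullmatch(c))
strict_hits = "".join(c for c in candidates if strict.fullmatch(c))
print(repr(loose_hits))    # the 15 characters from " " to "."
print(repr(strict_hits))   # ' -.'

# With the range in place, odd separators are accepted as part of the number.
odd = re.findall(
    r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <]|$)",
    "+1*222#222,2222",
)
print(odd)                 # ['+1*222#222,2222']
```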

Here’s a possible tweak incorporating the above

import re
pat = r"^((?:\+1[-. ]?|1[-. ]?)?(?:\(\d{3}\) ?|\d{3}[-. ]?)\d{3}[-. ]?\d{4})(?:[ <]|$)"
print( re.findall(pat, "+1.222.222.2222") )
print( re.findall(pat, "+1(222)222.2222") )
print( re.findall(pat, "+1(222.222.2222") )

Output:

['+1.222.222.2222']
['+1(222)222.2222']
[]
Answered By: constantstranger

Maybe try:

import re
re.findall(r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:| |<|$)", "+1.222.222.2222")
  • the empty alternative (nothing before the first |) matches at any position, +1.222.222.2222
  • the space alternative matches a space character, "+1.222.222.2222 "
  • < matches a less-than sign character, +1.222.222.2222<
  • $ matches at the end of the string, +1.222.222.2222
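
One thing worth checking before relying on this: because the empty alternative matches at any position, the trailing group can never fail, so a number followed by any other character is still returned. A quick sketch (variable names are illustrative only) comparing it with the (?:[ <]|$) form from the question:

```python
import re

# The phone-number part from the question, kept as one capturing group.
number = r"(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})"

with_empty = number + r"(?:| |<|$)"     # the empty alternative always succeeds
without_empty = number + r"(?:[ <]|$)"  # needs " ", "<" or the end of the string

text = "+1.222.222.2222x"               # the number is followed by a stray "x"

hits_with_empty = re.findall(with_empty, text)
hits_without_empty = re.findall(without_empty, text)
print(hits_with_empty)      # ['+1.222.222.2222']
print(hits_without_empty)   # []
```

If a number followed by arbitrary text should be rejected, the version without the empty alternative is probably the safer choice.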

You can also use regex101 for easier debugging.

Answered By: ItsLame