python regex: get end digits from a string

Question:

I am quite new to Python and regex, and I have the following simple string:

s=r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716"""

I would like to extract only the last digits in the above string, i.e. 767980716, and I was wondering how I could achieve this using a Python regex.

I wanted to do something similar along the lines of:

re.compile(r"""-(.*?)""").search(str(s)).group(1)

My intent was for (.*?) to capture everything that starts after a "-" and runs to the end of the string, but this returns nothing.

I was wondering if anyone could point me in the right direction.
Thanks.

Asked By: JohnJ


Answers:

Use the following regex:

\d+$

$ anchors the match at the end of the string.

\d matches a digit.

+ matches the preceding element one or more times.
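
As a minimal sketch of how this pattern could be applied to the string from the question (the variable names `match` and `last_digits` are only illustrative), `re.search` can be used, with a check for the case where nothing matches:

```python
import re

s = "99-my-name-is-John-Smith-6376827-%^-1-2-767980716"

# \d+ matches one or more digits; $ anchors the match at the end of the string.
match = re.search(r"\d+$", s)

# re.search returns None when the string does not end in digits.
last_digits = match.group(0) if match else None
print(last_digits)  # 767980716
```

Writing the pattern as a raw string (`r"..."`) keeps Python from interpreting the backslash before the regex engine sees it.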

Answered By: Anirudha

Your regex should be (\d+)$.

  • \d+ matches one or more digits.
  • $ matches at the end of the string.

So, your code should be:

>>> s = "99-my-name-is-John-Smith-6376827-%^-1-2-767980716"
>>> import re
>>> re.compile(r'(\d+)$').search(s).group(1)
'767980716'

You also don't need to call str() here, since s is already a string.

Answered By: Rohit Jain

Try using \d+$ instead. That matches one or more digits followed by the end of the string.

Answered By: Asad Saeeduddin

You can use re.match to capture only the trailing digits:

>>> import re
>>> s=r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716"""
>>> re.match('.*?([0-9]+)$', s).group(1)
'767980716'

Alternatively, re.finditer works just as well:

>>> next(re.finditer(r'\d+$', s)).group(0)
'767980716'

Explanation of all regexp components:

  • .*? is a non-greedy match and consumes only as little as possible (a greedy match would consume everything except for the last digit).
  • [0-9] and \d are two different ways of matching digits. Note that the latter also matches digits in other writing systems, like ୪ or ൨ (in Python 3 this applies to str patterns unless the re.ASCII flag is set).
  • Parentheses (()) make the content of the expression a group, which can be retrieved with group(1) (or 2 for the second group, 0 for the whole match).
  • + means one or more repetitions (at least one digit at the end).
  • $ matches only at the end of the input (or just before a trailing newline).
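
The difference between the greedy and non-greedy prefix, and between `\d` and `[0-9]`, can be checked with a short sketch like the one below. It assumes Python 3; the names `lazy`, `greedy` and `odia` are illustrative only.

```python
import re

s = "99-my-name-is-John-Smith-6376827-%^-1-2-767980716"

# Non-greedy prefix: .*? consumes as little as possible,
# so the group captures the whole trailing run of digits.
lazy = re.match(r".*?([0-9]+)$", s).group(1)

# Greedy prefix: .* consumes as much as it can, leaving one digit for the group.
greedy = re.match(r".*([0-9]+)$", s).group(1)

print(lazy)    # 767980716
print(greedy)  # 6

# U+0B6A is the Odia digit four: \d matches it in a str pattern, [0-9] does not.
odia = "abc\u0b6a"
print(bool(re.search(r"\d+$", odia)))            # True
print(bool(re.search(r"[0-9]+$", odia)))         # False
print(bool(re.search(r"\d+$", odia, re.ASCII)))  # False
```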
Answered By: phihag

Nice and simple with findall:

import re

s=r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716"""

print(re.findall('^.*-([0-9]+)$', s))

['767980716']

Regex Explanation:

^         # Match the start of the string
.*        # Followed by anything
-         # Up to the last hyphen
([0-9]+)  # Capture the digits after the hyphen
$         # Match the end of the string

Or, more simply, just match the digits at the end of the string: '([0-9]+)$'
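
The two patterns are not interchangeable: the first requires a hyphen immediately before the trailing digits, while the second does not. A small sketch of the difference, using made-up sample strings that are not from the question:

```python
import re

with_hyphen = r"^.*-([0-9]+)$"
digits_only = r"([0-9]+)$"

# Hypothetical inputs chosen to show where the two patterns disagree.
samples = ["99-my-name-2-767980716", "abc123", "a-b12"]

results = {}
for text in samples:
    results[text] = (re.findall(with_hyphen, text), re.findall(digits_only, text))
    print(text, results[text])

# 99-my-name-2-767980716 (['767980716'], ['767980716'])
# abc123 ([], ['123'])
# a-b12 ([], ['12'])
```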

Answered By: Chris Seymour

Save the regular expressions for something that requires more heavy lifting.

>>> def parse_last_digits(line): return line.split('-')[-1]
>>> s = parse_last_digits(r"99-my-name-is-John-Smith-6376827-%^-1-2-767980716")
>>> s
'767980716'
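
Note that `line.split('-')[-1]` returns whatever follows the last hyphen, digits or not (for example `'Smith'` for `"John-Smith"`). If that matters, a variant along these lines could validate the result; this is only a sketch, and returning `None` for non-digit tails is an assumption about the desired behaviour:

```python
def parse_last_digits(line):
    """Return the text after the last hyphen if it is all digits, else None."""
    # rpartition splits once, at the last hyphen only.
    _, _, tail = line.rpartition("-")
    # str.isdigit() is False for an empty string; it also accepts non-ASCII digits.
    return tail if tail.isdigit() else None


print(parse_last_digits("99-my-name-is-John-Smith-6376827-%^-1-2-767980716"))  # 767980716
print(parse_last_digits("John-Smith"))  # None
```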
Answered By: yurisich

I have been playing around with several of these solutions, but many fail with an AttributeError if the string does not end in digits, because re.search and re.match return None in that case. The following code handles it.

import re

W = input("Enter a string:")
# Run the match once and reuse the result.
m = re.match('.*?([0-9]+)$', W)
if m is None:
    last_digits = "None"
else:
    last_digits = m.group(1)
print("Last digits of "+W+" are "+last_digits)
Answered By: Kenneth Watanabe