Python regex to extract version from a string

Question:

The string looks like this (\n is used to break the line):

MySQL-vm
Version 1.0.1

WARNING:: NEVER EDIT/DELETE THIS SECTION

What I want is only 1.0.1.

I am trying re.search(r"Version+'([^']*)'", my_string, re.M).group(1) but it is not working.

re.findall(r'\d+', version) gives me a list of the numbers, which I then have to join back together.

How can I improve the regex?

Asked By: Amby


Answers:

Use the below regex and get the version number from group index 1.

Version\s*([\d.]+)

DEMO

>>> import re
>>> s = """MySQL-vm
... Version 1.0.1
... 
... WARNING:: NEVER EDIT/DELETE THIS SECTION"""
>>> re.search(r'Version\s*([\d.]+)', s).group(1)
'1.0.1'

Explanation:

Version                  'Version'
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times)
(                        group and capture to \1:
  [\d.]+                   any character of: digits (0-9), '.' (1
                           or more times)
)                        end of \1
Answered By: Avinash Raj
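
As a minimal sketch of how the capture-group pattern above might be wrapped for reuse: the names VERSION_RE and extract_version are hypothetical and not part of the answer. The point is that re.search returns None when nothing matches, so calling .group(1) directly on the result would raise AttributeError on input that has no version line.

```python
import re

# Same pattern as in the answer: "Version", optional whitespace,
# then a run of digits and dots captured as group 1.
VERSION_RE = re.compile(r"Version\s*([\d.]+)")


def extract_version(text):
    """Return the captured version string, or None if nothing matches."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


sample = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"
print(extract_version(sample))             # 1.0.1
print(extract_version("no version here"))  # None
```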

You can also use a positive lookbehind, which does not consume characters in the string but only asserts whether a match is possible at that position. With the regex below, the whole match is the version number, so you don't need findall or a capture group.

(?<=Version )[\d.]+

Online demo

Explanation:

  (?<=                     look behind to see if there is:
    Version                  'Version '
  )                        end of look-behind
  [\d.]+                   any character of: digits (0-9), '.' (1 or more times)
Answered By: Braj

(?<=Version\s)\S+

Try this. Use it with re.findall.

x="""MySQL-vm
  Version 1.0.1

  WARNING:: NEVER EDIT/DELETE THIS SECTION"""

print(re.findall(r"(?<=Version\s)\S+", x))

Output: ['1.0.1']

See demo.

http://regex101.com/r/dK1xR4/12

Answered By: vks
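
One caveat with the lookbehind answers, sketched below under the assumption that the standard re module is used: a lookbehind there must have a fixed width, so it cannot absorb a variable amount of whitespace after "Version". The sample text with three spaces is made up for illustration.

```python
import re

# Hypothetical input: three spaces between "Version" and the number.
text = "MySQL-vm\nVersion   1.0.1\n"

# Fixed-width lookbehind: requires exactly one whitespace character
# directly before the non-space run, so nothing is found here.
with_lookbehind = re.findall(r"(?<=Version\s)\S+", text)
print(with_lookbehind)  # []

# A variable-width lookbehind is rejected when the pattern is compiled.
try:
    re.compile(r"(?<=Version\s*)\S+")
    lookbehind_compiled = True
except re.error:
    lookbehind_compiled = False
print(lookbehind_compiled)  # False

# A capture group tolerates any amount of whitespace.
with_group = re.findall(r"Version\s+(\S+)", text)
print(with_group)  # ['1.0.1']
```

If the spacing in the input is not guaranteed, the capture-group form is likely the safer choice.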

https://regex101.com/r/5Us6ow/1

A somewhat recursive pattern that matches versions like 1, 1.0, and 1.0.1:

import re

def version_parser(v):
    versionPattern = r'\d+(=?\.(\d+(=?\.(\d+)*)*)*)*'
    regexMatcher = re.compile(versionPattern)
    return regexMatcher.search(v).group(0)
Answered By: yourstruly
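
For comparison, here is a flatter pattern that should accept the same 1, 1.0, 1.0.1 shapes without nested groups. This is a swapped-in alternative, not the pattern from the answer above, and the names SIMPLE_VERSION_RE and first_version are hypothetical. Note that it returns the first run of digits anywhere in the text, so it assumes no other number appears before the version.

```python
import re

# One or more digits, followed by any number of ".digits" groups.
SIMPLE_VERSION_RE = re.compile(r"\d+(?:\.\d+)*")


def first_version(text):
    """Return the first dotted-number run in text, or None if there is none."""
    match = SIMPLE_VERSION_RE.search(text)
    return match.group(0) if match else None


for candidate in ("Version 1", "Version 1.0", "Version 1.0.1"):
    print(first_version(candidate))
```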

This is an old question, but none of the answers cover corner cases such as "Version 1.2.3." (ending with a dot) or "Version 1.2.3.A" (ending with a non-numeric value).
Here is my solution:

ver = "Version 1.2.3.9\nWarning blah blah..."
print(bool(re.match(r"Version\s*[\d.]+\d", ver)))
Answered By: Payam
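
The snippet above only reports whether the line matches. A rough sketch of a capturing variant follows; it assumes the goal is to pull out the version with any trailing dot or letter dropped. The pattern uses [\d.]*\d rather than [\d.]+\d so that a single-digit version is also accepted, and the helper name is hypothetical.

```python
import re

# [\d.]*\d forces the captured text to end in a digit, so a trailing
# "." or ".A" is left outside the group.
ENDS_IN_DIGIT_RE = re.compile(r"Version\s*([\d.]*\d)")


def version_ending_in_digit(text):
    """Return the version captured at the start of text, or None."""
    match = ENDS_IN_DIGIT_RE.match(text)
    return match.group(1) if match else None


print(version_ending_in_digit("Version 1.2.3.9\nWarning blah blah..."))  # 1.2.3.9
print(version_ending_in_digit("Version 1.2.3."))                         # 1.2.3
print(version_ending_in_digit("Version 1.2.3.A"))                        # 1.2.3
```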

We can use Python's re library.
The regex below is for versions that contain numbers only.

import re

versions = re.findall(r'[0-9]+\.[0-9]+\.?[0-9]*', AVAILABLE_VERSIONS)

unique_versions = set(versions) # convert it to set to get unique versions

where AVAILABLE_VERSIONS is a string containing the versions.

Answered By: Rittick Paul
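
A set has no order, and sorting version strings as plain text would put 1.10.0 before 1.2. The sketch below shows one way to sort the unique versions numerically. The sample string and the version_key helper are assumptions for illustration, and the pattern is a slightly stricter stand-in (it never captures a trailing dot) rather than the one from the answer.

```python
import re

# Hypothetical input; the real AVAILABLE_VERSIONS comes from elsewhere.
AVAILABLE_VERSIONS = "1.0.1, 1.2, 1.0.1, 1.10.0"

# Digits followed by at least one ".digits" group.
versions = re.findall(r"\d+(?:\.\d+)+", AVAILABLE_VERSIONS)


def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))


unique_sorted = sorted(set(versions), key=version_key)
print(unique_sorted)  # ['1.0.1', '1.2', '1.10.0']
```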