Validate number with multiple points

Question:

I want to extract a version number from a string that has multiple dots. For instance, app-9.6.0 should return 9.6.0 and app-960 should return None. I tried the code below, but it also returns numbers without dots.

import re
re.findall(r'[.\d]+', 'app-960')

How can I implement a parser or regex for this case?

Asked By: daniboy000


Answers:

Try this:

import re
str_for_search = 'app-9.6.0'
search = re.search(r'\w+-(\d+\.\d+\.\d+)', str_for_search)
if search:
    version = search.group(1)
else:
    version = None
print(version)
Answered By: 555Russich

Is it always 2 dots?

If so, you can do this:

import re
re.findall(r'\d+\.\d+\.\d+', 'app-960')
re.findall(r'\d+\.\d+\.\d+', 'app-9.6.0')

If no dots is OK, then what you already have works.

If you want at least 1 dot, you can do:

re.findall(r'\d\.[.\d]+', 'app-9.6.0')

Edit:

You can do this to avoid the problem of multiple dots in a row:

import re
re.findall(r'\d(?:\.\d)+', 'app-9.6.0 app=9..9 app1.1.1.1.1.1 app1')
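
Note that a single \d between the dots only handles one-digit components. A small sketch comparing it with a multi-digit variant (the names single_digit, multi_digit and sample are made up for illustration, and the \d+ variant is a suggested adjustment rather than part of the original edit):

```python
import re

# Pattern from the edit above: exactly one digit per component.
single_digit = r'\d(?:\.\d)+'
# Assumed variant: one or more digits per component.
multi_digit = r'\d+(?:\.\d+)+'

sample = 'app-1.10.0 app-9..9 app-960'

# The one-digit pattern splits 1.10.0 into two partial matches.
print(re.findall(single_digit, sample))  # ['1.1', '0.0']
# The multi-digit pattern keeps the whole version together.
print(re.findall(multi_digit, sample))   # ['1.10.0']
```

Both patterns still skip 9..9 and 960, since each dot must be followed by a digit and at least one dot is required.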
Answered By: DorElias

If app- is always going to be a consistent string, it’s probably going to be faster and easier to do this without a regex:

def version_number(full_version):
    version_num = full_version.replace("app-", "", 1)
    # If I'm interpreting your question right, you also
    # want to validate that the version contains at least one dot
    if "." in version_num:
        return version_num
    else:
        return None

If you need to do this with a regex for some reason, others have given examples that should work.
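
One caveat: the function above only checks that a dot is present, so something like app-a.b or app-9..6 would also be returned. A minimal sketch of a stricter check without a regex, assuming every dot-separated part must consist of ASCII digits (the name strict_version_number and the prefix parameter are hypothetical):

```python
def strict_version_number(full_version, prefix="app-"):
    """Return the dotted version after `prefix`, or None if it is not valid."""
    if not full_version.startswith(prefix):
        return None
    candidate = full_version[len(prefix):]
    parts = candidate.split(".")
    # Require at least one dot (two parts) and only digits in every part.
    # An empty part, as in "9..6", fails the isdigit() check.
    if len(parts) >= 2 and all(p.isascii() and p.isdigit() for p in parts):
        return candidate
    return None


print(strict_version_number("app-9.6.0"))  # 9.6.0
print(strict_version_number("app-960"))    # None
```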

Answered By: chucklay

If you want to match app- and digits with 1 or more dots in between, you can use a capture group.

Start the capture by matching 1 or more digits, then repeat 1 or more times matching a dot followed by 1 or more digits.

Example

import re

pattern = r'app-(\d+(?:\.\d+)+)'
s = 'app-1 app-9.6.0, app-1.0, app-1.1.0, app-1.10.0, app-1.1.1.0'

print(re.findall(pattern, s))

Output

['9.6.0', '1.0', '1.1.0', '1.10.0', '1.1.1.0']

A broader variant matching 1 or more non-whitespace characters with \S+ before the hyphen:

pattern = r'\S+-(\d+(?:\.\d+)+)'
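
To get the single value or None that the question asks for, the broader pattern can be wrapped in a small helper. This is only a sketch: the names VERSION_RE and extract_version are made up here, and it assumes the first match in the string is the one wanted.

```python
import re

# Broader pattern from above, compiled once for reuse.
VERSION_RE = re.compile(r'\S+-(\d+(?:\.\d+)+)')


def extract_version(name):
    """Return the dotted version after a hyphen, or None if there is none."""
    match = VERSION_RE.search(name)
    return match.group(1) if match else None


print(extract_version('app-9.6.0'))  # 9.6.0
print(extract_version('app-960'))    # None
```

Because re.search is used, trailing text after the version is ignored. If the whole string has to match, re.fullmatch would be the stricter choice.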
Answered By: The fourth bird

To be more general and avoid being tied to "app-", I suggest this old-school approach with a simple algorithm:
As long as the end of the string consists of digits or dots, collect it.
At the end, check whether a dot was found. That's all.

def old_geek(someString):
    version = ""
    pointFound = False
    # read in reverse order
    for i in range(len(someString)-1,-1,-1):
        c = someString[i]
        if c not in ".0123456789":
            break
        version = c + version
        pointFound = pointFound or (c == '.')
    if not pointFound:
        version = None
    return version

I tested it against 555Russich's regex, which is not tied to app- either (but still has a flaw with thisIsMyApp):

print("------ old geek -----")
for test in tests:
    print(test,' :',old_geek(test))

print("------- 555 Rushich")
for test in tests:
    print(test,' :',reg1(test))

------ old geek -----
app-9.6.0  : 9.6.0
app-960  : None
thisIsMyApp-123456.0  : 123456.0
appli.bat-200.0.1  : 200.0.1
app-1.1.1.1.1.0  : 1.1.1.1.1.0
------- 555 Rushich
app-9.6.0  : 9.6.0
app-960  : None
thisIsMyApp-123456.0  : None
appli.bat-200.0.1  : 200.0.1
app-1.1.1.1.1.0  : 1.1.1
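
The tests list and the reg1 helper are not shown above. A minimal reconstruction, assuming tests holds the five strings from the output and reg1 simply wraps 555Russich's pattern:

```python
import re

# Assumed test inputs, taken from the printed output above.
tests = [
    'app-9.6.0',
    'app-960',
    'thisIsMyApp-123456.0',
    'appli.bat-200.0.1',
    'app-1.1.1.1.1.0',
]


def reg1(text):
    """Assumed wrapper around 555Russich's regex; returns None on no match."""
    search = re.search(r'\w+-(\d+\.\d+\.\d+)', text)
    return search.group(1) if search else None


for test in tests:
    print(test, ' :', reg1(test))
```

The pattern requires exactly three numeric components, which explains the None for thisIsMyApp-123456.0 and the truncated 1.1.1 in the last line.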

A question I asked myself: how does the old way of programming compare with a regex in performance?

loop : 10000
check_oldGeek --- 0.03163409233093262 seconds 
loop : 100000
check_oldGeek --- 0.31208014488220215 seconds 
loop : 1000000
check_oldGeek --- 3.101634979248047 seconds 
loop : 10000
reg1 --- 0.03866410255432129 seconds 
loop : 100000
reg1 --- 0.3761019706726074 seconds 
loop : 1000000
reg1 --- 3.765758991241455 seconds 

old geek wins: 3.10 s against 3.76 s for 1 million loops. Not by much.

Hope you enjoy it as much as I do 🙂
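
The timing code itself is not shown. One possible way to run such a comparison with the standard timeit module is sketched below; the benchmark helper, the sample string and the precompiled REG1 are assumptions, so the numbers will not necessarily match the ones above.

```python
import re
import timeit

# Assumed: 555Russich's pattern, compiled once outside the timed call.
REG1 = re.compile(r'\w+-(\d+\.\d+\.\d+)')


def reg1(text):
    match = REG1.search(text)
    return match.group(1) if match else None


def old_geek(text):
    # Same idea as the function above: collect trailing digits and dots.
    version = ""
    point_found = False
    for c in reversed(text):
        if c not in ".0123456789":
            break
        version = c + version
        point_found = point_found or c == "."
    return version if point_found else None


def benchmark(func, sample="app-9.6.0", loops=10_000):
    """Return the total seconds taken by `loops` calls of func(sample)."""
    return timeit.timeit(lambda: func(sample), number=loops)


if __name__ == "__main__":
    for func in (old_geek, reg1):
        print(func.__name__, '---', benchmark(func), 'seconds')
```

Results depend heavily on the sample string, the Python version and the machine, so any single run should be read with caution.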

Answered By: pirela