Version number comparison in Python

Question:

I want to write a cmp-like function that compares two version numbers and returns -1, 0, or 1 based on their compared values.

  • Return -1 if version A is older than version B
  • Return 0 if versions A and B are equivalent
  • Return 1 if version A is newer than version B

Each subsection is supposed to be interpreted as a number, therefore 1.10 > 1.1.

Desired function outputs are

mycmp('1.0', '1') == 0
mycmp('1.0.0', '1') == 0
mycmp('1', '1.0.0.1') == -1
mycmp('12.10', '11.0.0.0.0') == 1
...

And here is my implementation, open for improvement:

def mycmp(version1, version2):
    parts1 = [int(x) for x in version1.split('.')]
    parts2 = [int(x) for x in version2.split('.')]

    # fill up the shorter version with zeros ...
    lendiff = len(parts1) - len(parts2)
    if lendiff > 0:
        parts2.extend([0] * lendiff)
    elif lendiff < 0:
        parts1.extend([0] * (-lendiff))

    for i, p in enumerate(parts1):
        ret = cmp(p, parts2[i])
        if ret: return ret
    return 0

I’m using Python 2.4.5 btw. (installed at my working place …).

Here’s a small ‘test suite’ you can use

assert mycmp('1', '2') == -1
assert mycmp('2', '1') == 1
assert mycmp('1', '1') == 0
assert mycmp('1.0', '1') == 0
assert mycmp('1', '1.000') == 0
assert mycmp('12.01', '12.1') == 0
assert mycmp('13.0.1', '13.00.02') == -1
assert mycmp('1.1.1.1', '1.1.1.1') == 0
assert mycmp('1.1.1.2', '1.1.1.1') == 1
assert mycmp('1.1.3', '1.1.3.000') == 0
assert mycmp('3.1.1.0', '3.1.2.10') == -1
assert mycmp('1.1', '1.10') == -1
Asked By: Johannes Charra

||

Answers:

No need to iterate over the version tuples. The built in comparison operator on lists and tuples already works exactly like you want it. You’ll just need to zero extend the version lists to the corresponding length. With python 2.6 you can use izip_longest to pad the sequences.

from itertools import izip_longest
def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    parts1, parts2 = zip(*izip_longest(parts1, parts2, fillvalue=0))
    return cmp(parts1, parts2)

With lower versions, some map hackery is required.

def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    parts1, parts2 = zip(*map(lambda p1,p2: (p1 or 0, p2 or 0), parts1, parts2))
    return cmp(parts1, parts2)
Answered By: Ants Aasma

This is a little more compact than your suggestion. Rather than filling the shorter version with zeros, I’m removing trailing zeros from the version lists after splitting.

def normalize_version(v):
    parts = [int(x) for x in v.split(".")]
    while parts[-1] == 0:
        parts.pop()
    return parts

def mycmp(v1, v2):
    return cmp(normalize_version(v1), normalize_version(v2))
Answered By: Pär Wieslander

Remove the uninteresting part of the string (trailing zeroes and dots), and then compare the lists of numbers.

import re

def mycmp(version1, version2):
    def normalize(v):
        return [int(x) for x in re.sub(r'(.0+)*$','', v).split(".")]
    return cmp(normalize(version1), normalize(version2))

This is the same approach as Pär Wieslander, but a bit more compact:

Here are some tests, thanks to “How to compare two strings in dot separated version format in Bash?“:

assert mycmp("1", "1") == 0
assert mycmp("2.1", "2.2") < 0
assert mycmp("3.0.4.10", "3.0.4.2") > 0
assert mycmp("4.08", "4.08.01") < 0
assert mycmp("3.2.1.9.8144", "3.2") > 0
assert mycmp("3.2", "3.2.1.9.8144") < 0
assert mycmp("1.2", "2.1") < 0
assert mycmp("2.1", "1.2") > 0
assert mycmp("5.6.7", "5.6.7") == 0
assert mycmp("1.01.1", "1.1.1") == 0
assert mycmp("1.1.1", "1.01.1") == 0
assert mycmp("1", "1.0") == 0
assert mycmp("1.0", "1") == 0
assert mycmp("1.0", "1.0.1") < 0
assert mycmp("1.0.1", "1.0") > 0
assert mycmp("1.0.2.0", "1.0.2") == 0
Answered By: gnud

Remove trailing .0 and .00 with regex, split and use cmp function which compares arrays correctly:

def mycmp(v1,v2):
 c1=map(int,re.sub('(.0+)+Z','',v1).split('.'))
 c2=map(int,re.sub('(.0+)+Z','',v2).split('.'))
 return cmp(c1,c2)

And, of course, you can convert it to a one-liner if you don’t mind the long lines.

Answered By: yu_sha
def compare_version(v1, v2):
    return cmp(*tuple(zip(*map(lambda x, y: (x or 0, y or 0), 
           [int(x) for x in v1.split('.')], [int(y) for y in v2.split('.')]))))

It’s a one liner (split for legability). Not sure about readable…

Answered By: mavnn

Is reuse considered elegance in this instance? 🙂

# pkg_resources is in setuptools
# See http://peak.telecommunity.com/DevCenter/PkgResources#parsing-utilities
def mycmp(a, b):
    from pkg_resources import parse_version as V
    return cmp(V(a),V(b))
Answered By: conny

The most difficult to read solution, but a one-liner nevertheless! and using iterators to be fast.

next((c for c in imap(lambda x,y:cmp(int(x or 0),int(y or 0)),
            v1.split('.'),v2.split('.')) if c), 0)

that is for Python2.6 and 3.+ btw, Python 2.5 and older need to catch the StopIteration.

Answered By: Paul

My preferred solution:

Padding the string with extra zeroes and just using the four first is easy to understand,
doesn’t require any regex and the lambda is more or less readable. I use two lines for readability, for me elegance is short and simple.

def mycmp(version1,version2):
  tup = lambda x: [int(y) for y in (x+'.0.0.0.0').split('.')][:4]
  return cmp(tup(version1),tup(version2))
Answered By: daramarak

How about using Python’s distutils.version.StrictVersion?

>>> from distutils.version import StrictVersion
>>> StrictVersion('10.4.10') > StrictVersion('10.4.9')
True

So for your cmp function:

>>> cmp = lambda x, y: StrictVersion(x).__cmp__(y)
>>> cmp("10.4.10", "10.4.11")
-1

If you want to compare version numbers that are more complex distutils.version.LooseVersion will be more useful, however be sure to only compare the same types.

>>> from distutils.version import LooseVersion, StrictVersion
>>> LooseVersion('1.4c3') > LooseVersion('1.3')
True
>>> LooseVersion('1.4c3') > StrictVersion('1.3')  # different types
False

LooseVersion isn’t the most intelligent tool, and can easily be tricked:

>>> LooseVersion('1.4') > LooseVersion('1.4-rc1')
False

To have success with this breed, you’ll need to step outside the standard library and use setuptools‘s parsing utility parse_version.

>>> from pkg_resources import parse_version
>>> parse_version('1.4') > parse_version('1.4-rc2')
True

So depending on your specific use-case, you’ll need to decide whether the builtin distutils tools are enough, or if it’s warranted to add as a dependency setuptools.

Answered By: bradley.ayers

This is my solution (written in C, sorry). I hope you’ll find it useful

int compare_versions(const char *s1, const char *s2) {
    while(*s1 && *s2) {
        if(isdigit(*s1) && isdigit(*s2)) {
            /* compare as two decimal integers */
            int s1_i = strtol(s1, &s1, 10);
            int s2_i = strtol(s2, &s2, 10);

            if(s1_i != s2_i) return s1_i - s2_i;
        } else {
            /* compare as two strings */
            while(*s1 && !isdigit(*s1) && *s2 == *s1) {
                s1++;
                s2++;
            }

            int s1_i = isdigit(*s1) ? 0 : *s1;
            int s2_i = isdigit(*s2) ? 0 : *s2;

            if(s1_i != s2_i) return s1_i - s2_i;
        }
    }

    return 0;
}
Answered By: e_asphyx

Lists are comparable in Python, so if someone converts the strings representing the numbers into integers, the basic Python comparison can be used with success.

I needed to extend this approach a bit because I use Python3x where the cmp function does not exist any more. I had to emulate cmp(a,b) with (a > b) - (a < b). And, version numbers are not that clean at all, and can contain all kind of other alphanumeric characters. There are cases when the function can’t tell the order so it returns False (see the first example).

So I’m posting this even if the question is old and answered already, because it may save a few minutes in someone’s life.

import re

def _preprocess(v, separator, ignorecase):
    if ignorecase: v = v.lower()
    return [int(x) if x.isdigit() else [int(y) if y.isdigit() else y for y in re.findall("d+|[a-zA-Z]+", x)] for x in v.split(separator)]

def compare(a, b, separator = '.', ignorecase = True):
    a = _preprocess(a, separator, ignorecase)
    b = _preprocess(b, separator, ignorecase)
    try:
        return (a > b) - (a < b)
    except:
        return False

print(compare('1.0', 'beta13'))    
print(compare('1.1.2', '1.1.2'))
print(compare('1.2.2', '1.1.2'))
print(compare('1.1.beta1', '1.1.beta2'))
Answered By: sanyi

In case you don’t want to pull in an external dependency here is my attempt written for Python 3.x.

rc, rel (and possibly one could add c) are regarded as “release candidate” and divide the version number into two parts and if missing the value of the second part is high (999). Else letters produce a split and are dealt as sub-numbers via base-36 code.

import re
from itertools import chain
def compare_version(version1,version2):
    '''compares two version numbers
    >>> compare_version('1', '2') < 0
    True
    >>> compare_version('2', '1') > 0
    True
    >>> compare_version('1', '1') == 0
    True
    >>> compare_version('1.0', '1') == 0
    True
    >>> compare_version('1', '1.000') == 0
    True
    >>> compare_version('12.01', '12.1') == 0
    True
    >>> compare_version('13.0.1', '13.00.02') <0
    True
    >>> compare_version('1.1.1.1', '1.1.1.1') == 0
    True
    >>> compare_version('1.1.1.2', '1.1.1.1') >0
    True
    >>> compare_version('1.1.3', '1.1.3.000') == 0
    True
    >>> compare_version('3.1.1.0', '3.1.2.10') <0
    True
    >>> compare_version('1.1', '1.10') <0
    True
    >>> compare_version('1.1.2','1.1.2') == 0
    True
    >>> compare_version('1.1.2','1.1.1') > 0
    True
    >>> compare_version('1.2','1.1.1') > 0
    True
    >>> compare_version('1.1.1-rc2','1.1.1-rc1') > 0
    True
    >>> compare_version('1.1.1a-rc2','1.1.1a-rc1') > 0
    True
    >>> compare_version('1.1.10-rc1','1.1.1a-rc2') > 0
    True
    >>> compare_version('1.1.1a-rc2','1.1.2-rc1') < 0
    True
    >>> compare_version('1.11','1.10.9') > 0
    True
    >>> compare_version('1.4','1.4-rc1') > 0
    True
    >>> compare_version('1.4c3','1.3') > 0
    True
    >>> compare_version('2.8.7rel.2','2.8.7rel.1') > 0
    True
    >>> compare_version('2.8.7.1rel.2','2.8.7rel.1') > 0
    True

    '''
    chn = lambda x:chain.from_iterable(x)
    def split_chrs(strings,chars):
        for ch in chars:
            strings = chn( [e.split(ch) for e in strings] )
        return strings
    split_digit_char=lambda x:[s for s in re.split(r'([a-zA-Z]+)',x) if len(s)>0]
    splt = lambda x:[split_digit_char(y) for y in split_chrs([x],'.-_')]
    def pad(c1,c2,f='0'):
        while len(c1) > len(c2): c2+=[f]
        while len(c2) > len(c1): c1+=[f]
    def base_code(ints,base):
        res=0
        for i in ints:
            res=base*res+i
        return res
    ABS = lambda lst: [abs(x) for x in lst]
    def cmp(v1,v2):
        c1 = splt(v1)
        c2 = splt(v2)
        pad(c1,c2,['0'])
        for i in range(len(c1)): pad(c1[i],c2[i])
        cc1 = [int(c,36) for c in chn(c1)]
        cc2 = [int(c,36) for c in chn(c2)]
        maxint = max(ABS(cc1+cc2))+1
        return base_code(cc1,maxint) - base_code(cc2,maxint)
    v_main_1, v_sub_1 = version1,'999'
    v_main_2, v_sub_2 = version2,'999'
    try:
        v_main_1, v_sub_1 = tuple(re.split('rel|rc',version1))
    except:
        pass
    try:
        v_main_2, v_sub_2 = tuple(re.split('rel|rc',version2))
    except:
        pass
    cmp_res=[cmp(v_main_1,v_main_2),cmp(v_sub_1,v_sub_2)]
    res = base_code(cmp_res,max(ABS(cmp_res))+1)
    return res


import random
from functools import cmp_to_key
random.shuffle(versions)
versions.sort(key=cmp_to_key(compare_version))
Answered By: Roland Puntaier
from distutils.version import StrictVersion
def version_compare(v1, v2, op=None):
    _map = {
        '<': [-1],
        'lt': [-1],
        '<=': [-1, 0],
        'le': [-1, 0],
        '>': [1],
        'gt': [1],
        '>=': [1, 0],
        'ge': [1, 0],
        '==': [0],
        'eq': [0],
        '!=': [-1, 1],
        'ne': [-1, 1],
        '<>': [-1, 1]
    }
    v1 = StrictVersion(v1)
    v2 = StrictVersion(v2)
    result = cmp(v1, v2)
    if op:
        assert op in _map.keys()
        return result in _map[op]
    return result

Implement for php version_compare, except “=”. Because it’s ambiguous.

Answered By: Ryan Fau

Another solution:

def mycmp(v1, v2):
    import itertools as it
    f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
    return cmp(f(v1), f(v2))

One can use like this too:

import itertools as it
f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
f(v1) <  f(v2)
f(v1) == f(v2)
f(v1) >  f(v2)
Answered By: pedrormjunior

I did this in order to be able to parse and compare the Debian package version string. Please notice that it is not strict with the character validation.

This might be helpful as well:

#!/usr/bin/env python

# Read <https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version> for further informations.

class CommonVersion(object):
    def __init__(self, version_string):
        self.version_string = version_string
        self.tags = []
        self.parse()

    def parse(self):
        parts = self.version_string.split('~')
        self.version_string = parts[0]
        if len(parts) > 1:
            self.tags = parts[1:]


    def __lt__(self, other):
        if self.version_string < other.version_string:
            return True
        for index, tag in enumerate(self.tags):
            if index not in other.tags:
                return True
            if self.tags[index] < other.tags[index]:
                return True

    @staticmethod
    def create(version_string):
        return UpstreamVersion(version_string)

class UpstreamVersion(CommonVersion):
    pass

class DebianMaintainerVersion(CommonVersion):
    pass

class CompoundDebianVersion(object):
    def __init__(self, epoch, upstream_version, debian_version):
        self.epoch = epoch
        self.upstream_version = UpstreamVersion.create(upstream_version)
        self.debian_version = DebianMaintainerVersion.create(debian_version)

    @staticmethod
    def create(version_string):
        version_string = version_string.strip()
        epoch = 0
        upstream_version = None
        debian_version = '0'

        epoch_check = version_string.split(':')
        if epoch_check[0].isdigit():
            epoch = int(epoch_check[0])
            version_string = ':'.join(epoch_check[1:])
        debian_version_check = version_string.split('-')
        if len(debian_version_check) > 1:
            debian_version = debian_version_check[-1]
            version_string = '-'.join(debian_version_check[0:-1])

        upstream_version = version_string

        return CompoundDebianVersion(epoch, upstream_version, debian_version)

    def __repr__(self):
        return '{} {}'.format(self.__class__.__name__, vars(self))

    def __lt__(self, other):
        if self.epoch < other.epoch:
            return True
        if self.upstream_version < other.upstream_version:
            return True
        if self.debian_version < other.debian_version:
            return True
        return False


if __name__ == '__main__':
    def lt(a, b):
        assert(CompoundDebianVersion.create(a) < CompoundDebianVersion.create(b))

    # test epoch
    lt('1:44.5.6', '2:44.5.6')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '2:44.5.6')
    lt('  44.5.6', '1:44.5.6')

    # test upstream version (plus tags)
    lt('1.2.3~rc7',          '1.2.3')
    lt('1.2.3~rc1',          '1.2.3~rc2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly2', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1~nightly2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc2~nightly1')

    # test debian maintainer version
    lt('44.5.6-lts1', '44.5.6-lts12')
    lt('44.5.6-lts1', '44.5.7-lts1')
    lt('44.5.6-lts1', '44.5.7-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6',      '44.5.6-lts1')
Answered By: Pius Raeder

i’m using this one on my project:

cmp(v1.split("."), v2.split(".")) >= 0
Answered By: Keyrr Perino

Years later, but stil this question is on the top.

Here is my version sort function. It splits version into numbers and non-numbers sections. Numbers are compared as int rest as str (as parts of list items).

def sort_version_2(data):
    def key(n):
        a = re.split(r'(d+)', n)
        a[1::2] = map(int, a[1::2])
        return a
    return sorted(data, key=lambda n: key(n))

You can use function key as kind of custom Version type with compare operators. If out really want to use cmp you can do it like in this example: https://stackoverflow.com/a/22490617/9935708

def Version(s):
    s = re.sub(r'(.0*)*$', '', s)  # to avoid ".0" at end
    a = re.split(r'(d+)', s)
    a[1::2] = map(int, a[1::2])
    return a

def mycmp(a, b):
    a, b = Version(a), Version(b)
    return (a > b) - (a < b)  # DSM's answer

Test suite passes.

Answered By: rysson
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.