Sorting a list of strings that contain numbers with Python

Question:

I have a list of strings that look like this:

l = ["sometext-2022-21_sometext", 
"sometext-2022-4_sometext", 
"sometext-2022-121_sometext",
"sometext-2022-321_sometext", 
"sometext-2022-1_sometext",
"sometext-2022-31_sometext"]

I want to sort it so that the string with the biggest number is on top, for example:

l = ["sometext-2022-321_sometext", 
"sometext-2022-121_sometext", 
"sometext-2022-31_sometext",
"sometext-2022-21_sometext", 
"sometext-2022-4_sometext",
"sometext-2022-1_sometext"]

How can I do that, since a plain sort does not work for this problem?
PS: In this case the string with the biggest number is also the longest, but that does not have to be the case; the strings can contain different text.

This is my real list of strings:

["https://essgfsgffghe.ch/docs/BS_Omni/BS_APG_001_AUS-2022-401_nodate.html", 
"https://ensfgtsfgscheidsuche.ch/docs/BS_Omni/BS_APG_001_BEZ-2022-39_nodate.html", 
"https://ensfgtsfguche.ch/docs/BS_Omni/BS_APG_001_VD-2022-5_nodate.html", 
"https://egsfsfgtscheidsuche.ch/docs/BS_Omni/BS_APG_001_VD-2022-2_nodate.html", 
"https://ensfgidsuche.ch/docs/BS_Omni/BS_APG_001_BES-2022-83_nodate.html", 
"https://sfgfgnsfgche.ch/docs/BS_Omni/BS_SVG_001_IV-2022-54_nodate.html", 
"https://entscsfghe.ch/docs/BS_Omni/BS_APG_001_BES-2022-36_nodate.html", 
"https://essntscsfguche.ch/docs/BS_Omni/BS_APG_001_AUS-2022-32_nodate.html", 
"https://entsfgeidsuche.ch/docs/BS_Omni/BS_APG_001_BES-2022-89_nodate.html", 
"https://entfsfgsgsfuche.ch/docs/BS_Omni/BS_APG_001_AUS-2022-412_nodate.html", 
"https://ensfgche.ch/docs/BS_Omni/BS_APG_001_VD-2022-5_nodate.html", 
"https://ensfgse.ch/docs/BS_Omni/BS_APG_001_BES-2022-70_nodate.html", 
"https://esfgche.ch/docs/BS_Omni/BS_APG_001_VD-2022-1_nodate.html"]
Asked By: taga


Answers:

You can use a regular expression, as long as the pattern you search for does not also occur in "sometext", which would produce false positives. list.sort accepts a key function that returns the value to sort by instead of the original value in the list. In this case, we can convert the digit strings to a tuple of integers, which sorts numerically the way you want.

import re

l = ["sometext-2022-21_sometext",
"sometext-2022-4_sometext",
"sometext-2022-121_sometext",
"sometext-2022-321_sometext",
"sometext-2022-1_sometext",
"sometext-2022-31_sometext"]

def l_sort_key(s):
    # capture the two digit groups in "-<digits>-<digits>_"
    v1, v2 = re.search(r"-(\d+)-(\d+)_", s).groups()
    return int(v1), int(v2)
    
l.sort(key=l_sort_key, reverse=True)
print(l)
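
As an illustration, the same key function can be tried on a few of the real URLs from the question. This is only a sketch: it assumes every URL contains exactly the "-<year>-<number>_" pattern, and the names url_sort_key and sorted_urls are made up for this example.

```python
import re

# A few of the URLs from the question
urls = [
    "https://essgfsgffghe.ch/docs/BS_Omni/BS_APG_001_AUS-2022-401_nodate.html",
    "https://ensfgtsfgscheidsuche.ch/docs/BS_Omni/BS_APG_001_BEZ-2022-39_nodate.html",
    "https://ensfgtsfguche.ch/docs/BS_Omni/BS_APG_001_VD-2022-5_nodate.html",
    "https://entfsfgsgsfuche.ch/docs/BS_Omni/BS_APG_001_AUS-2022-412_nodate.html",
]

def url_sort_key(s):
    # Hypothetical helper: assumes "-<year>-<number>_" appears in every URL
    year, number = re.search(r"-(\d+)-(\d+)_", s).groups()
    return int(year), int(number)

# sorted() returns a new list and leaves urls unchanged
sorted_urls = sorted(urls, key=url_sort_key, reverse=True)
for u in sorted_urls:
    print(u)
```

The URLs come out in the order 412, 401, 39, 5, because the keys are compared as integers rather than as text.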
Answered By: tdelaney

You can use sorted(iterable, key=func, reverse=True), with func defined to pull out the value you actually want to sort by. In your case, it could be something like this:

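def sort_func(item):
    '''Return [year, number] as integers, e.g. [2022, 21].'''
    # vals[0] is the text before the first underscore, e.g. "sometext-2022-21"
    vals = item.split('_')
    # keep the two hyphen-separated fields that follow the leading text
    nums = vals[0].split('-')[1:3]
    nums = [int(num) for num in nums]
    return nums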
    
sorted_list = sorted(l, key=sort_func, reverse=True)

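If the way to identify the number in the middle is more complex, you may indeed need a more robust regex solution. The real URLs in the question are such a case: they contain an underscore before the number, so the split above would not find it.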
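
One possible shape for such a more robust key is sketched below. It rests on assumptions that are not in the question: that the wanted numbers are the last "-<digits>-<digits>_" occurrence in the string, and that strings without the pattern should end up at the bottom. NUM_PATTERN, robust_sort_key and the sample items are hypothetical.

```python
import re

# Assumed pattern "-<digits>-<digits>_", compiled once for reuse
NUM_PATTERN = re.compile(r"-(\d+)-(\d+)_")

def robust_sort_key(s):
    matches = NUM_PATTERN.findall(s)
    if not matches:
        # No match: with reverse=True this key sorts the string last
        return (-1, -1)
    # Assumption: the last occurrence holds the year and number
    year, number = matches[-1]
    return int(year), int(number)

# Made-up sample data, including awkward cases
items = [
    "some-text-2022-4_other_text",
    "no numbers here",
    "x-1-2_sometext-2022-121_sometext",
    "sometext-2022-31_sometext",
]

result = sorted(items, key=robust_sort_key, reverse=True)
print(result)
```

Returning a sentinel key instead of raising keeps the sort from failing on a single malformed string; whether that is the right behaviour depends on the data.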

Answered By: scotscotmcc