Sorting os.listdir()'s arbitrary order for desired output

Question:

I have a folder of .txt files that are named "Month day.txt".

Running os.listdir('.') returns them in this order:

["April 30.txt", "April 4.txt", "May 1.txt", "May 10.txt", "May 11.txt", "May 2.txt"]

The order I would like to output:

["April 4.txt", "April 30.txt", "May 1.txt", "May 2.txt", "May 10.txt", "May 11.txt"]

To get it, I have a feeling I need to somehow sort the names by number first and then alphabetically.
I spent the last hour researching similar problems, and the best I was able to find was this:

import os
import re

files = os.listdir('.')
re_pattern = re.compile(r'.+?(\d+)\.([a-zA-Z0-9+])')
files_ordered = sorted(files, key=lambda x: int(re_pattern.match(x).groups()[0]))

As I understand it, this uses a regex to capture the digits and uses them as the sort key,
so it arranges the files by number only and ignores the month (May 12, April 13, etc.).
I also tried adding a capture group for the month, but to no avail.

Asked By: Zen


Answers:

You can actually sort this rather easily with the datetime module.

For example:

from datetime import datetime
a = ["April 30.txt", "April 4.txt", "May 1.txt", "May 10.txt", "May 11.txt", "May 2.txt"]

b = sorted(a,key=lambda x: datetime.strptime(x[:-4], "%B %d"))
print(b)

output:

['April 4.txt', 'April 30.txt', 'May 1.txt', 'May 2.txt', 'May 10.txt', 'May 11.txt']

This takes the list of filenames and, for each one, strips the .txt extension with x[:-4] and parses the remaining date string into a datetime object with datetime.strptime. The built-in sorted function then orders the filenames by those datetime objects, which compare chronologically.
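To make the sort key concrete, here is a small sketch of what the key function produces for a single filename. The helper name date_key is only illustrative. Because the filenames carry no year, strptime fills in its default year of 1900, which is harmless here since every key gets the same year. It also assumes an English locale, since %B matches the locale's full month names.

```python
from datetime import datetime

def date_key(filename):
    # Hypothetical helper: drop the 4-character ".txt" suffix, then parse "Month day".
    return datetime.strptime(filename[:-4], "%B %d")

# With no year in the string, strptime defaults to 1900.
print(date_key("April 4.txt"))   # 1900-04-04 00:00:00
print(date_key("April 30.txt"))  # 1900-04-30 00:00:00

# datetime objects compare chronologically, so sorted() can use them as keys.
print(date_key("April 4.txt") < date_key("April 30.txt"))  # True
```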

If your files have extensions of different lengths, e.g. .txt and .json, then instead of x[:-4] you could use os.path.splitext(x)[0], or a regex, to isolate and remove the file extension.
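A minimal sketch of the os.path.splitext variant, assuming every name in the list follows the "Month day" pattern before its extension. The helper name stem_date_key and the sample filenames are made up for illustration. A name that does not match the pattern would make strptime raise ValueError, so real directory listings may need filtering first.

```python
import os
from datetime import datetime

def stem_date_key(filename):
    # os.path.splitext splits "May 10.json" into ("May 10", ".json"),
    # so the extension length no longer matters.
    stem, _ext = os.path.splitext(filename)
    return datetime.strptime(stem, "%B %d")

# Illustrative filenames with mixed extensions (not from the original folder).
files = ["May 10.json", "April 30.txt", "May 2.txt", "April 4.json"]
ordered = sorted(files, key=stem_date_key)
print(ordered)
# ['April 4.json', 'April 30.txt', 'May 2.txt', 'May 10.json']
```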

Answered By: Alexander