Sort a txt file based on numbers

Question:

I have a txt file of data that looks like this:

@0 #1
@30 #2
@750 #2
@20 #3
@500 #3
@2500 #4
@460 #4
@800 #1
@2200 #1
@290 #2
@4700 #4
@570 #1

How do I sort the file based on the integer between the @ and #?

The output should be:

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4
Asked By: Neeraja

||

Answers:

You just need to read in the text and split it by new lines, then use the sorted function using only the integer part of line as the key.

with open('my_text_file.txt') as textfile:
    lines = textfile.read().split('n')    # ['@0 #1', '@30 #2', '@750 #2', '@20 #3', '@500 #3', '@2500 #4', '@460 #4', '@800 #1', '@2200 #1', '@290 #2', '@4700 #4', '@570 #1']
    lines = sorted(lines, key=lambda i: int(i[1:i.index('#') -1]))  # ['@0 #1', '@20 #3', '@30 #2', '@290 #2', '@460 #4', '@500 #3', '@570 #1', '@750 #2', '@800 #1', '@2200 #1', '@2500 #4', '@4700 #4']
    txt = 'n'.join(lines)

with open('my_new_text_file.txt', 'wt') as textfile:
    textfile.write(txt)

output

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4
Answered By: Alexander

This can be done cleanly with a regular expression

import re

with open("somefile.txt") as file:
    lines = sorted((line.strip() for line in file), 
            key=lambda s: int(re.match(r"@(d+)s*#", s).group(1)))
print(lines)

This will raise an error if any strings don’t match the pattern, which is the right thing to do if the file format is strict. You could instead write a function that checks the regex and returns a default value.

Answered By: tdelaney
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.