Sort a txt file based on numbers

Question

I have a txt file of data that looks like this:

@0 #1
@30 #2
@750 #2
@20 #3
@500 #3
@2500 #4
@460 #4
@800 #1
@2200 #1
@290 #2
@4700 #4
@570 #1

How do I sort the file based on the integer between the @ and #?

The output should be:

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4

Asked By: Neeraja

Answer 1

Read the file, split it into lines, then call sorted with a key that extracts only the integer part of each line.

with open('my_text_file.txt') as textfile:
    # Split on newlines and skip blank lines (e.g. the one left by a trailing newline)
    lines = [line for line in textfile.read().split('\n') if line.strip()]
    # ['@0 #1', '@30 #2', '@750 #2', '@20 #3', '@500 #3', '@2500 #4', '@460 #4', '@800 #1', '@2200 #1', '@290 #2', '@4700 #4', '@570 #1']

    # Sort by the integer between the '@' and the space before '#'
    lines = sorted(lines, key=lambda i: int(i[1:i.index('#') - 1]))
    # ['@0 #1', '@20 #3', '@30 #2', '@290 #2', '@460 #4', '@500 #3', '@570 #1', '@750 #2', '@800 #1', '@2200 #1', '@2500 #4', '@4700 #4']
    txt = '\n'.join(lines)

with open('my_new_text_file.txt', 'wt') as textfile:
    textfile.write(txt)

Output:

@0 #1
@20 #3
@30 #2
@290 #2
@460 #4
@500 #3
@570 #1
@750 #2
@800 #1
@2200 #1
@2500 #4
@4700 #4

Answered By: Alexander

Answer 2

This can be done cleanly with a regular expression:

import re

with open("somefile.txt") as file:
    # Sort by the integer captured between '@' and '#'
    lines = sorted((line.strip() for line in file),
                   key=lambda s: int(re.match(r"@(\d+)\s*#", s).group(1)))
print(lines)

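This will raise an error if any line doesn't match the pattern (re.match returns None, so calling .group on it fails with an AttributeError), which is the right thing to do if the file format is strict. Note that this includes blank lines. You could instead write a function that checks the regex and returns a default value.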
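A minimal sketch of that fallback idea, assuming non-matching lines should simply sort ahead of everything else. The helper name sort_key, the PATTERN constant, the default of -1, and the sample list are illustrative choices, not part of the original answer:

```python
import re

# Same shape as the answer's pattern: '@', digits, optional whitespace, '#'
PATTERN = re.compile(r"@(\d+)\s*#")

def sort_key(line, default=-1):
    """Return the integer between '@' and '#', or `default` if the line does not match."""
    match = PATTERN.match(line)
    return int(match.group(1)) if match else default

# Illustrative in-memory data; in practice these would be the stripped lines of the file
lines = ["@30 #2", "garbage", "@0 #1", "@750 #2", "@20 #3"]
result = sorted(lines, key=sort_key)
print(result)
```

With a default of -1, lines that don't match end up first. Passing a large default (for example float("inf")) through a lambda or functools.partial would push them to the end instead.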

Answered By: tdelaney
