Need to sort a numeric file and store last digit of each value in a list or array

Question:

I need to sort a numeric file that contains thousands of lines of numbers such as those below.
I need the last digit of each number collected in a list or array, grouped under the number's first 4 digits.

66542
66543
66546
66781
66783
66784
66787

I would like the output to appear as:

6654[236]
6678[1347]

or something similar that shortens the file.

I have tried the following, but I am still way off: it only outputs the last digits in a single array, [2, 3, 6, 1, 3, 4, 7].

   #!/usr/bin/env python3

   # Open the file and read the numbers
   with open('number-file.txt', 'r') as file:
       numbers = file.readlines()

   # Initialize an empty list to store the last digits
   last_digits = []
   # Loop through the numbers and store the last digit of each number in the list
   for number in numbers:
       last_digit = int(number.strip()) % 10
       last_digits.append(last_digit)

   print(last_digits)

Asked By: Chris Fritz


Answers:

from collections import defaultdict

with open("number-file.txt", "r") as infile:
    number_lines = infile.readlines()

results = defaultdict(list)

for line in number_lines:
    k = line[:4]
    v = int(line[4:].strip())
    results[k].append(v)

# pretty print the results
for k, v in results.items():
    print(f"{k}{v}")

For each line, we split each integer into the first 4 and the remaining digits, use the first part as the key to a defaultdict, and append the latter to a list. Then we print those results.

6654[2, 3, 6]
6678[1, 3, 4, 7]

If you need your output to appear exactly as you stated, you could instead do something like:

for k, v in results.items():
    print(f"{k}[{''.join([str(x) for x in v])}]")

6654[236]
6678[1347]
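
The loop above assumes every line holds a number and prints the groups in file order. As a minimal sketch, assuming the file may contain blank lines and that sorted output is wanted, the same idea can be wrapped in a function. The name compress_numbers and the prefix_len parameter are hypothetical, chosen for illustration only:

```python
from collections import defaultdict


def compress_numbers(lines, prefix_len=4):
    """Group numeric strings by their first prefix_len digits."""
    groups = defaultdict(list)
    # Drop blank lines (int() would reject them), then sort numerically
    # so both the prefixes and the trailing digits come out in order.
    cleaned = (s.strip() for s in lines if s.strip())
    for value in sorted(cleaned, key=int):
        groups[value[:prefix_len]].append(value[prefix_len:])
    # Dicts keep insertion order, so the groups stay sorted by prefix.
    return [f"{prefix}[{''.join(tails)}]" for prefix, tails in groups.items()]


# Simulated file contents: unsorted, with one blank line
sample = ["66783\n", "\n", "66542\n", "66546\n", "66781\n",
          "66543\n", "66787\n", "66784\n"]
print("\n".join(compress_numbers(sample)))
```

For a real file, passing the open file object (for example compress_numbers(infile)) should work the same way, since the function only iterates over lines.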
Answered By: Danielle M.

If your numbers are in order, you could use itertools.groupby to group by the first 4 digits and collect the last digits for each group:

from itertools import groupby

# simulated input data
numbers = ['66542', '66543', '66546', '66781', '66783', '66784', '66787']
parts = [(n[:4],n[4]) for n in numbers]
results = { k: [v[1] for v in g] for k, g in groupby(parts, key=lambda t:t[0]) }

Output:

{'6654': ['2', '3', '6'], '6678': ['1', '3', '4', '7']}

This can be formatted as desired:

print('\n'.join(f"{k}[{''.join(v)}]" for k, v in results.items()))

Output:

6654[236]
6678[1347]
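
Because groupby only merges adjacent items, unsorted input would split one prefix into several groups. A rough sketch of sorting first, assuming all numbers have the same number of digits (so numeric order keeps equal prefixes next to each other); the input list here is made up and deliberately shuffled:

```python
from itertools import groupby

# simulated input data, deliberately out of order
numbers = ['66783', '66542', '66787', '66546', '66781', '66543', '66784']

# sort numerically so that equal 4-digit prefixes become adjacent
ordered = sorted(numbers, key=int)

# one output line per prefix, with the trailing digits joined together
lines = [
    f"{prefix}[{''.join(n[4:] for n in group)}]"
    for prefix, group in groupby(ordered, key=lambda n: n[:4])
]
print('\n'.join(lines))
```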
Answered By: Nick