I want to extract specific lines which contain x & y data from a TXT file and sort them by y data

Question

This is a screenshot of my TXT file, and I have marked the part that I need to extract.

I want to extract specific lines which contain sales data (the items and their sales amounts) from a TXT file and sort them by sales amount.
The result should look like this: each item on its own line, numbered by its rank when sorted by amount, without showing the amount itself:
1- Citem
2- eitem
3- Ditem
4- Xitem
5- aitem
6- bitem
7- Yitem
I use this code, but I get an error:

with open ('myfile', 'r') as myfile:
    for myline in myfile:
        if "sold" in myline:
            item, amount = myline.split('(')
            for index, item in enumerate((amount)):
                print(index, item.rstrip("n"))
[This is the result (whole code)][1]

When I just extract the items without indexing and sorting them by amount, the code below works, but it is not the answer that I want:

with open ('myfile.txt', 'rt') as myfile:
    for myline in myfile:
        if "sold" in myline:
            Item, Amount = myline.split('(')
          
            print(Item.rstrip("n"))

[Just extracting the Items without sorting them by amount][2]


  [1]: https://i.stack.imgur.com/4gsUR.png
  [2]: https://i.stack.imgur.com/nD2ha.png

Asked By: Mary E


Answer 1

If the text file format is exactly as shown then you could do this:

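items = []

# multipliers for the scale word that follows the number
m = {'thousand': 1_000, 'million': 1_000_000}

with open('myfile.txt') as data:
    for line in data:
        if 'sold)' in line:
            # the first token is the item name; e holds the remaining tokens
            item, *e = line.split()
            # e[-3] is the number with a leading '(' (stripped by [1:]),
            # e[-2] is the scale word; unknown words fall back to a factor of 1
            n = float(e[-3][1:]) * m.get(e[-2], 1)
            items.append((n, item))

# tuples compare by their first element, so this sorts by amount (ascending)
print(sorted(items))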

Output:

[(13000.0, 'Yitem'), (120000.0, 'bitem'), (191000.0, 'aitem'), (2000000.0, 'Xitem'), (7200000.0, 'Ditem'), (32000000.0, 'eitem'), (96300000.0, 'Citem')]
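
The list above is in ascending order and still shows the amounts, while the question asks for a numbered list with the largest seller first and no amounts. A minimal sketch of that last step, using the tuples from the output above as literal input (the helper name `ranked_lines` is made up for illustration):

```python
# (amount, item) tuples copied from the output shown above
items = [(13000.0, 'Yitem'), (120000.0, 'bitem'), (191000.0, 'aitem'),
         (2000000.0, 'Xitem'), (7200000.0, 'Ditem'), (32000000.0, 'eitem'),
         (96300000.0, 'Citem')]


def ranked_lines(pairs):
    """Return strings such as '1- Citem', largest amount first."""
    # sort on the amount only, in descending order
    ordered = sorted(pairs, key=lambda pair: pair[0], reverse=True)
    # enumerate(..., start=1) supplies the rank; the amount itself is dropped
    return [f"{rank}- {name}" for rank, (_, name) in enumerate(ordered, start=1)]


for text in ranked_lines(items):
    print(text)
```

With the seven tuples above, this prints `1- Citem` through `7- Yitem`, in the order requested in the question.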

Answered By: Cobra

Answer 2

Here is one way of doing this:

with open ('myfile', 'r') as myfile:
    data = myfile.readlines()

# match "thousand" and "million" to a number
scalekey = {"thousand": 1000, "million": 1000000}

items = []  # store item names
saleamount = []  # store amount of sales

# loop through the data
for line in data:
    if "sold" in line:
        itemline = line.strip().replace("sold", "")  # remove sold from the line
        items.append(itemline.split()[0])  # item name is the first value
        
        # get the number of sold items from between the brackets
        nsold = itemline[itemline.index("(") + 1: itemline.index(")")]
        
        # convert the number into a float (int() would fail on a decimal such as "7.2")
        number = float(nsold.strip().split()[0])
        
        # get the scale factor (i.e., thousand or million)
        scale = scalekey[nsold.strip().split()[1]]
        saleamount.append(number * scale)

# perform sorting
sortedlist = sorted(zip(saleamount, items))

print(sortedlist)
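
The same per-line parsing could also be done with a regular expression instead of `index` and `split`. This is only a sketch of that alternative, and it assumes each sales line looks like `Citem (96.3 million sold)`; the names `SCALE`, `PATTERN` and `parse_sale` are hypothetical, and the pattern would need adjusting if the real file differs:

```python
import re

# scale words mapped to multipliers; a missing scale word counts as 1
SCALE = {"thousand": 1_000, "million": 1_000_000}

# ASSUMED line shape: "<item> ... (<number> [<scale word>] sold)"
PATTERN = re.compile(r"^(\S+).*\(\s*([\d.]+)\s*(\w+)?\s+sold\)")


def parse_sale(line):
    """Return (item, amount) for a sales line, or None if it does not match."""
    match = PATTERN.search(line.strip())
    if match is None:
        return None
    name, number, unit = match.groups()
    return name, float(number) * SCALE.get(unit, 1)


# made-up sample lines in the assumed format
sample = ["Header line", "Yitem (13 thousand sold)", "Xitem (2 million sold)"]
parsed = [p for p in map(parse_sale, sample) if p is not None]
print(sorted(parsed, key=lambda pair: pair[1], reverse=True))
```

Lines that do not match the pattern are skipped rather than raising an error, which may or may not be what you want for a file with unexpected formatting.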

Answered By: Matt Pitkin
