List Comprehension with Regular Expressions in a Text File Python

Question:

I’m doing a Python course and want to find all numbers in a text file with a regular expression and sum them up.
Now I want to try to do it with a list comprehension.

import re
try:
    fh = open(input('Enter a file Name: ')) #input
except:
    print('Enter an existing file name') #error
    quit()

he = list() #store numbers
for lines in fh:
    lines = lines.rstrip()
    stuff = re.findall('[0-9]+', lines)
    if len(stuff) == 0: #skip lines with no number
        continue
    else:
        for i in stuff:
            he.append(int(i)) #add numbers to storage
print(sum(he)) #print sum of stored numbers

This is my current code. The instructor said it's possible to write the code in 2 lines or so.

import re
print( sum( [ ****** *** * in **********('[0-9]+',**************************.read()) ] ) )

The "*" characters should be replaced.

This text should be used to practice:

Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity. You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem. This book assumes that
everyone needs to know how to program …

I know the general concept of list comprehension but I have no idea what to do.

Asked By: Heheboi

Answers:

I think your instructor meant something like this:

import re
print(sum([int(i) for i in re.findall('[0-9]+', open(input('Enter a file Name: ')).read())]))

I spread it out into more lines so we can read it more easily:

print(
    sum([
        int(i) for i in re.findall(
            '[0-9]+', open(input('Enter a file Name: ')).read()
        )
    ])
)

To explain what is going on here, let’s replace the parts of your code step by step.

You can create the stuff variable in the same way as your original code in only one line:

stuff = re.findall('[0-9]+', open(input('Enter a file Name: ')).read())

All I did there was move the file opening, open(input('Enter a file Name: ')), into the re.findall() call and read the whole file at once with .read(), instead of looping with for lines in fh.
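
As a quick check of that step, here is a small sketch. Because '[0-9]+' cannot match across a newline, searching the whole text at once finds the same numbers as searching line by line. The string text below stands in for the file contents and is only an illustration.

```python
import re

# Stand-in for the file contents (the first lines of the practice text).
text = "Why should you learn to write programs? 7746\n12 1929 8827\n"

# Line by line, as in the original loop.
per_line = []
for line in text.splitlines():
    per_line.extend(re.findall('[0-9]+', line))

# Whole text in one call, as with .read().
whole = re.findall('[0-9]+', text)

print(per_line == whole)  # both are ['7746', '12', '1929', '8827']
```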

Then, instead of writing a for loop (for i in stuff) and adding int(i) to the he list one by one, we can use our first list comprehension:

he = [int(i) for i in stuff]

Or, if we replace stuff with what we wrote before,

he = [int(i) for i in re.findall('[0-9]+', open(input('Enter a file Name: ')).read())]

Finally, we put a sum around that to get the sum of all items in the list he that we have created.
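
Putting the steps together, here is a minimal sketch that runs the same expression on the practice text from the question, held in a string instead of a file so it can be run without any input. The names practice_text and sum_numbers are made up for this illustration.

```python
import re

# Practice text from the question, inlined so no file is needed.
practice_text = """Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity. You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem. This book assumes that
everyone needs to know how to program …
"""


def sum_numbers(text):
    """Return the sum of every run of digits found in text."""
    return sum([int(i) for i in re.findall('[0-9]+', text)])


print(sum_numbers(practice_text))  # 27486
```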

Answered By: stelioslogothetis

The solution using list comprehension is:

import re

with open(input('Enter a file name: '), 'r') as fh:
    print(sum(int(i) for i in re.findall('[0-9]+', fh.read())))

Explanation:

• The with statement opens the file and automatically closes it once the indented block has run.

• re.findall('[0-9]+', fh.read()) returns a list of all the numbers in the file as strings.

• The expression int(i) for i in re.findall('[0-9]+', fh.read()) converts each string to an integer. Written without square brackets inside sum(), it is a generator expression, which works like a list comprehension but does not build the list first.

• Finally, sum() adds up all of those integers.
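
To illustrate the first and last points, here is a hedged sketch using a temporary file (created with the standard tempfile module purely for the demonstration; the variable names are assumptions). The expression inside sum() above has no square brackets, which makes it a generator expression; the sketch shows that it gives the same total as a list comprehension, and that the file is closed once the with block ends.

```python
import os
import re
import tempfile

# Create a throwaway file so the example does not depend on user input.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('12 apples\nand 30 pears\n')
    path = tmp.name

with open(path, 'r') as fh:
    text = fh.read()

closed_after_with = fh.closed  # True: the with block closed the file

# Generator expression: values are produced one at a time for sum().
total_gen = sum(int(i) for i in re.findall('[0-9]+', text))
# List comprehension: the full list is built first, then summed.
total_list = sum([int(i) for i in re.findall('[0-9]+', text)])

os.remove(path)  # clean up the temporary file
print(closed_after_with, total_gen, total_list)  # True 42 42
```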

Hope this solution helps you.

File text: (shown in an image that is not reproduced here)

import re
with open('./sum_numbers.txt', 'r') as f:
    # this line sums all the numbers in the file
    print(sum([int(no) for no in re.findall(r'\d+', f.read())])) # 91
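
On the pattern itself: the digit class is written \d, with a backslash, and a raw string (r'\d+') keeps that backslash intact. Without it, 'd+' matches the letter d instead of digits. A short sketch of the difference, using a made-up sample string:

```python
import re

sample = 'add 12 and 30'

print(re.findall('d+', sample))      # ['dd', 'd']
print(re.findall(r'\d+', sample))    # ['12', '30']
print(re.findall('[0-9]+', sample))  # ['12', '30']
```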
Answered By: Muhammad Ali