Sum of strings extracted from text file using regex

Question:

I am just learning Python and need some help with my class assignment.

I have a file with text and numbers in it. Some lines have one to three numbers and others have no numbers at all.

I need to:

  1. Extract numbers only from the file using regex

  2. Find the sum of all the numbers

I used regex to extract out all the numbers. I am trying to get the total sum of all the numbers but I am just getting the sum of each line that had numbers. I have been battling with different ways to do this assignment and this is the closest I have gotten to getting it right.

I know I am missing some key parts but I am not sure what I am doing wrong.

Here is my code:

import re
text = open('text_numbers.txt')

for line in text:
    line = line.strip()
    y = re.findall('([0-9]+)',line)

    if len(y) > 0:
        print sum(map(int, y))

The result I get is something like this
(each is a sum of a line):

14151

8107

16997

18305

3866

And it needs to be one sum like this (sum of all numbers):

134058

Asked By: MacLovin


Answers:

import re
import numpy as np
text = open('text_numbers.txt')
final = []
for line in text:
    line = line.strip()
    y = re.findall('([0-9]+)',line)

    if len(y) > 0:
        lineVal = sum(map(int, y))
        final.append(lineVal)
        print "line sum = {0}".format(lineVal)
print "Final sum = {0}".format(np.sum(final))

Is that what you’re looking for?
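
The same accumulate-across-lines idea also works without numpy. Here is a minimal sketch in Python 3 syntax (the answer above uses Python 2 print statements); the function name total_of_numbers and the sample lines are made up for illustration:

```python
import re

def total_of_numbers(lines):
    """Return the sum of every integer found in an iterable of text lines."""
    total = 0  # running total, kept across all lines
    for line in lines:
        # findall returns the digit runs as strings, e.g. ['12', '7']
        for match in re.findall(r'[0-9]+', line):
            total += int(match)
    return total

# Hypothetical sample data standing in for the lines of text_numbers.txt
sample_lines = ["Room 12 has 7 chairs", "no numbers here", "3 cats, 40 dogs, 5 birds"]
print(total_of_numbers(sample_lines))  # 12 + 7 + 3 + 40 + 5 = 67
```

Passing an open file object instead of a list should work the same way, since iterating over a file yields its lines.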

Answered By: Ryan Cheale
import re
text = open('text_numbers.txt')
data=text.read()
print sum(map(int,re.findall(r"\b\d+\b",data)))

Use .read() to get the whole file's content as a single string.
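
Note that a word-boundary pattern such as r"\b\d+\b" does not behave exactly like the [0-9]+ used in the question: it skips digits that are attached to letters. A small sketch in Python 3 syntax, using a made-up sample string rather than the real file:

```python
import re

# Made-up sample text: some digits stand alone, some are attached to letters
data = "abc123 45 6x"

plain = re.findall(r"[0-9]+", data)      # every run of digits
bounded = re.findall(r"\b\d+\b", data)   # only digit runs that form a whole word

print(plain)    # ['123', '45', '6']
print(bounded)  # ['45']
```

Which of the two is right depends on whether numbers embedded in words should count toward the sum.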

Answered By: vks
import re
sample = open ('text_numbers.txt')
total =0
dignum = 0 

for line in sample:
    line = line.rstrip()
    dig= re.findall('[0-9]+', line)

    if len(dig) >0:
        dignum += len(dig)
        linetotal= sum(map(int, dig))
        total += linetotal

print 'The count of numbers found is:  '
print dignum
print 'The sum is: '
print total     
print 'The sum ends with: '
print  total % 1000
Answered By: Max

I don't know much Python, but I can give a simple solution.
Try this:

import re
hand = open('text_numbers.txt')
x=list()
for line in hand:
    y=re.findall('[0-9]+',line)
    x=x+y
sum=0
for i in x:
    sum=sum + int(i)
print sum
Answered By: Tuhin
import re
print sum([int(i) for i in re.findall('[0-9]+',open(raw_input('What is the file you want to analyze?\n'),'r').read())])

You can compact it into one line, but this is only for fun!

Answered By: Beer_Hammer

This is my first attempt at answering with regular expressions. I find it a great skill to practise, as is reading other people's code.

import re # import regular expressions
chuck_text = open("regex_sum_286723.txt")
numbers = []
Total = 0
for line in chuck_text:
    nmbrs = re.findall('[0-9]+', line)
    numbers = numbers + nmbrs 
for n in numbers:
    Total = Total + float(n)
print "Total = ", Total 

Thanks to Beer_Hammer for the list-comprehension one-liner, though his 'r' does not seem to be needed and I am not sure what it does. It reads beautifully; I get more confused reading two sets of loops like the ones in my answer.

import re
print sum([int(i) for i in re.findall('[0-9]+',open("regex_sum_286723.txt").read())])
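
On the 'r': it is the mode argument of open() and means "read as text". That is already the default, so leaving it out changes nothing. A minimal sketch in Python 3 syntax that checks this against a throwaway temporary file (the helper name read_both_ways and the sample text are made up for illustration):

```python
import os
import tempfile

def read_both_ways(path):
    """Read the same file with the default mode and with an explicit 'r'."""
    with open(path) as f_default:         # mode defaults to 'r'
        default_text = f_default.read()
    with open(path, 'r') as f_explicit:   # same mode, spelled out
        explicit_text = f_explicit.read()
    return default_text, explicit_text

# Write a small hypothetical sample file, read it back both ways, then clean up
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("3 apples and 4 pears\n")
    sample_path = tmp.name

default_text, explicit_text = read_both_ways(sample_path)
os.remove(sample_path)

print(default_text == explicit_text)  # True
```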
Answered By: Andrew Church

Here is my solution to this problem.

import re

file = open('text_numbers.txt')
sum = 0 

for line in file:
    line = line.rstrip()
    line = re.findall('([0-9]+)', line)
    for i in line:
        i = int(i)
        sum += i    

print(sum)

In the first for loop, line is reassigned to the list returned by re.findall, so I used a second for loop to convert its elements from strings to integers so that I can sum them.
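
The conversion step matters because re.findall returns strings, and sum() cannot add strings to its integer start value. A short sketch in Python 3 syntax, with a made-up sample line, shows what happens with and without the conversion:

```python
import re

# Made-up sample line; findall returns the matches as strings
found = re.findall('([0-9]+)', 'Room 12 has 7 chairs')
print(found)  # ['12', '7']

# Summing the strings directly fails, because sum() starts from the integer 0
try:
    sum(found)
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# Converting each match to int first makes the sum work
converted = [int(s) for s in found]
print(sum(converted))  # 19
```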

Answered By: Phoneix
import re

fl=open('regex_sum_7469.txt')
ls=[]

for x in fl: # go through the file line by line
    x=x.rstrip()
    print x
    t= re.findall('[0-9]+',x) # all numbers on this line
    for d in t: # loop over the matches; t is empty for lines without numbers
        ls.append(int(d))
print (sum(ls))
Answered By: Machine Learning XL

Here is my code:

import re

f = open('regex_sum_text.txt', 'r').read().strip()
y = re.findall('[0-9]+', f)
l = [int(s) for s in y]
s = sum(l)
print(s)

Another, shorter way is:

with open('regex_sum_text.txt', 'r') as f:
    total = sum(map(int, re.findall(r'[0-9]+', f.read())))

print(total)
Answered By: Md. Jamal Uddin
import re
print(sum(int(value) for value in re.findall('[0-9]+', open('regex_sum_1128122.txt').read())))
Answered By: Rodrigo Kreis

This is how I solved it:

import re

hand = open("regex_sum_1778498.txt")
x=list()
for line in hand:
    y = re.findall('[0-9]+',line)
    # keep every line that has at least one number
    if len(y)>0:
        x=x+y

out=list()
for value in x:
    out.append(float(value))
print(sum(out))
Answered By: Areeba Yousuf