Sum of strings extracted from text file using regex
Question:
I am just learning Python and need some help with my class assignment.
I have a file with text and numbers in it. Some lines have one to three numbers and others have no numbers at all.
I need to:
- Extract numbers only from the file using regex
- Find the sum of all the numbers
I used a regex to extract all the numbers. I am trying to get the total sum of all the numbers, but I am only getting the sum of each line that has numbers. I have been battling with different ways to do this assignment, and this is the closest I have gotten to getting it right.
I know I am missing some key parts but I am not sure what I am doing wrong.
Here is my code:

    import re

    text = open('text_numbers.txt')
    for line in text:
        line = line.strip()
        y = re.findall('([0-9]+)', line)
        if len(y) > 0:
            print(sum(map(int, y)))

The result I get is something like this (each is the sum of one line):
14151
8107
16997
18305
3866
And it needs to be one sum like this (sum of all numbers):
134058
Answers:

    import re
    import numpy as np

    text = open('text_numbers.txt')
    final = []  # per-line sums
    for line in text:
        line = line.strip()
        y = re.findall('([0-9]+)', line)
        if len(y) > 0:
            lineVal = sum(map(int, y))
            final.append(lineVal)
            print("line sum = {0}".format(lineVal))
    print("Final sum = {0}".format(np.sum(final)))

Is that what you’re looking for?
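
For comparison, here is a minimal sketch of the same idea without NumPy. The key change from the question's code is that the total is accumulated across lines and printed once, after the loop. The `sample_lines` list is made up for illustration and stands in for the lines of text_numbers.txt.

```python
import re

# Hypothetical sample lines standing in for text_numbers.txt.
sample_lines = [
    "Why should you learn to write programs? 7746",
    "12 1929 8827",
    "no numbers on this line",
    "and 42 here",
]

total = 0  # the running total lives outside the loop
for line in sample_lines:
    numbers = re.findall(r'[0-9]+', line)
    total += sum(map(int, numbers))  # sum() of an empty list is 0

print(total)  # printed once, after all lines are processed
```

Because `sum()` of an empty list is 0, the `if len(y) > 0` check is not strictly needed in this variant.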

    import re

    text = open('text_numbers.txt')
    data = text.read()
    print(sum(map(int, re.findall(r"\b\d+\b", data))))

Use .read() to get the file's content as a single string.
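
Note that a word-boundary pattern and the plain `[0-9]+` pattern used elsewhere in this thread can give different totals when digits are glued to letters. A small sketch on a made-up string (the sample text is an assumption, not taken from the assignment file):

```python
import re

# Made-up sample text; "abc123" and "x9y" have digits glued to letters.
data = "10 apples, abc123, x9y, and 5 pears"

standalone = re.findall(r"\b\d+\b", data)  # digit runs bounded by word boundaries
any_digits = re.findall(r"[0-9]+", data)   # every run of digits, wherever it occurs

print(standalone)
print(any_digits)
print(sum(map(int, standalone)), sum(map(int, any_digits)))
```

Which pattern is right depends on whether numbers embedded in words should count toward the sum.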

    import re

    sample = open('text_numbers.txt')
    total = 0
    dignum = 0  # how many numbers were matched (not individual digits)
    for line in sample:
        line = line.rstrip()
        dig = re.findall('[0-9]+', line)
        if len(dig) > 0:
            dignum += len(dig)
            linetotal = sum(map(int, dig))
            total += linetotal
    print('The count of numbers is: ')
    print(dignum)
    print('The sum is: ')
    print(total)
    print('The sum ends with: ')
    print(total % 1000)
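
I don't know much Python, but I can give a simple solution. Try this:

    import re

    hand = open('text_numbers.txt')
    x = list()
    for line in hand:
        y = re.findall('[0-9]+', line)
        x = x + y  # collect every match across all lines
    total = 0  # named total to avoid shadowing the built-in sum()
    for i in x:
        total = total + int(i)
    print(total)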

    import re
    print(sum([int(i) for i in re.findall('[0-9]+', open(input('What is the file you want to analyze?\n'), 'r').read())]))

You can compact it into one line, but this is only for fun!

This is my first attempt at answering with regular expressions. I find reading other people's code a great skill to practise.

    import re  # import regular expressions

    chuck_text = open("regex_sum_286723.txt")
    numbers = []
    Total = 0
    for line in chuck_text:
        nmbrs = re.findall('[0-9]+', line)
        numbers = numbers + nmbrs
    for n in numbers:
        Total = Total + float(n)  # float() makes the total print with a trailing .0
    print("Total = {0}".format(Total))

Thanks to Beer for the list-comprehension one-liner, though his ‘r’ seems unnecessary and I am not sure what it does. It reads beautifully; I get more confused reading two sets of loops like in my answer.

    import re
    print(sum([int(i) for i in re.findall('[0-9]+', open("regex_sum_286723.txt").read())]))
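
On the 'r' argument: it is the mode passed to `open()`, meaning "open for reading". Reading in text mode is the default, so leaving it out should behave the same. A quick sketch that checks this on a throwaway temporary file (the file and its contents are made up for the example):

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("3 apples and 4 pears\n")
    path = tmp.name

# Open the same file once without a mode and once with an explicit 'r'.
with open(path) as default_mode, open(path, 'r') as explicit_mode:
    same_mode = default_mode.mode == explicit_mode.mode
    same_text = default_mode.read() == explicit_mode.read()

os.remove(path)  # clean up the temporary file
print(same_mode, same_text)
```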

Here is my solution to this problem.

    import re

    file = open('text_numbers.txt')
    total = 0  # named total to avoid shadowing the built-in sum()
    for line in file:
        line = line.rstrip()
        line = re.findall('([0-9]+)', line)  # line is now a list of digit strings
        for i in line:
            i = int(i)
            total += i
    print(total)

In the first for loop, re.findall() turns each line into a list of strings, so I used a second for loop to convert each element from a string to an integer before adding it to the total.
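
To make the string-to-integer step concrete, here is a small sketch on a made-up line (the sample text is an assumption). It shows that `re.findall()` returns strings, which cannot be added as numbers until they are converted with `int()`:

```python
import re

line = "From 3 files, 12 matched and 7 failed"  # made-up sample line
found = re.findall('([0-9]+)', line)

print(found)  # a list of digit strings, not integers

as_ints = [int(s) for s in found]  # convert each match before summing
print(sum(as_ints))

# A line without digits gives an empty list, so the inner loop simply does nothing.
print(re.findall('([0-9]+)', "no digits here"))
```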

    import re

    fl = open('regex_sum_7469.txt')
    ls = []
    for x in fl:  # go through the file line by line
        x = x.rstrip()
        print(x)  # echo each line as it is read
        t = re.findall('[0-9]+', x)  # all numbers on this line
        for d in t:  # inner loop, since t is empty for lines without numbers
            ls.append(int(d))
    print(sum(ls))

Here is my code:

    import re

    f = open('regex_sum_text.txt', 'r').read().strip()
    y = re.findall('[0-9]+', f)
    l = [int(s) for s in y]
    s = sum(l)
    print(s)

Another, shorter way is:

    import re

    with open('regex_sum_text.txt', 'r') as f:
        total = sum(map(int, re.findall(r'[0-9]+', f.read())))
    print(total)

    import re
    print(sum(int(value) for value in re.findall('[0-9]+', open('regex_sum_1128122.txt').read())))

This is how I solved it:

    import re

    hand = open("regex_sum_1778498.txt")
    x = list()
    for line in hand:
        y = re.findall('[0-9]+', line)
        if len(y) > 0:  # keep every line with at least one number
            x = x + y
    out = list()
    for value in x:
        out.append(float(value))
    print(sum(out))
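
A side note on `float()` versus `int()`: the pattern `[0-9]+` only matches whole numbers, so either conversion gives the same value, but the float total prints with a trailing `.0`. A tiny sketch with made-up matches (the list is an assumption, standing in for what `re.findall()` would return):

```python
numbers = ['12', '30', '8']  # made-up matches, as re.findall would return them

int_total = sum(int(n) for n in numbers)
float_total = sum(float(n) for n in numbers)

print(int_total)    # integer total
print(float_total)  # same value, shown as a float
```

If the assignment expects an exact integer answer, `int()` is probably the safer choice.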