How can I extract complete numbers (e.g. 10 instead of 1, 0) from a txt file and add them to a list?
Question:
#Open the text file in read mode to extract integers
outfile = open(r'/Users/alfonsomartinezpetz/Desktop/Numbers_in_text.txt', 'r')
#List that will store the numbers
processed_list = []
#For loop that reads one line after the other
for line in outfile:
    #Nested for loop to check for integers
    for i in line:
        #if statement that checks for numbers only
        if i.isdigit() == True:
            #Adds numbers to the list
            processed_list += i
print(processed_list)
outfile.close()
#For example the txt file has this:
Donec sit amet ligula eu tellus venenatis 10 maximus vel vel lectus. Donec pretium risus eu odio semper, id placerat felis luctus. Maecenas commodo mauris vitae augue congue fermentum. Sed efficitur tincidunt elit, nec elementum orci tempus vel. Aliquam tempor ligula nisi. Proin non elit at lorem ornare faucibus a at turpis. Pellentesque 15 molestie aliquam quam vel faucibus.
My current method adds the numbers digit by digit, like so: [1, 0, 1, 5, …]. I’m looking for [10, 15, …], preferably with no imports. Maybe some sort of strip or split? I’ve tried those, but I still get the same list.
Answers:
Your code, adjusted to iterate through a list of words instead of single characters:
#Open the text file in read mode to extract integers
outfile = open(r'/Users/alfonsomartinezpetz/Desktop/Numbers_in_text.txt', 'r')
#List that will store the numbers
processed_list = []
#For loop that reads one line after the other
for line in outfile:
    line = line.split() #added; splits on any whitespace, including the trailing newline
    #Nested for loop to check for integers
    for i in line:
        #if statement that checks for numbers only
        if i.isdigit():
            #Adds the whole number to the list (changed: += would add each character separately)
            processed_list.append(int(i))
print(processed_list)
outfile.close()
Rather than pulling out each character and checking if it is a digit, pull out each word (separated by a " ") and check those.
...
words = line.split(" ")
for word in words:
    try:
        processed_list.append(int(word))
    except ValueError:
        pass
You may also want to do some cleanup on word in that loop in case it ends with a special character. int() won’t work if it is '5.' or '5,'.
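One way to do that cleanup without any imports is to strip punctuation from both ends of each word before calling int(). This is only a minimal sketch: the function name extract_ints and the set of characters in strip_chars are assumptions chosen for illustration, not part of the original code, so adjust them to whatever punctuation your file actually contains.

```python
def extract_ints(text, strip_chars=".,;:!?()[]\"'"):
    """Return the integers found among the whitespace-separated words of text.

    extract_ints and strip_chars are hypothetical names for this sketch.
    """
    numbers = []
    for word in text.split():
        # Remove leading/trailing punctuation so '15.' or '(20)' becomes '15' / '20'
        cleaned = word.strip(strip_chars)
        try:
            numbers.append(int(cleaned))
        except ValueError:
            # Not an integer (ordinary word, empty string, etc.): skip it
            pass
    return numbers


sample = "venenatis 10 maximus, vel 15. lectus (20) and 5, too"
print(extract_ints(sample))  # [10, 15, 20, 5]
```

To use it on the file, you would call extract_ints(line) for each line and extend processed_list with the result.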
You could do:
for line in outfile:
    processed_list += [int(s) for s in line.split() if s.isdigit()]
However, this will only extract non-negative integers that stand alone between whitespace, so tokens such as '15,' or '-3' are skipped.
If you want to consider floats and/or negative numbers, I would look at using regex.
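A rough sketch of the regex route, if an import of the standard re module is acceptable. The pattern and the names NUMBER_PATTERN and extract_numbers are assumptions made for this example. The pattern matches an optional minus sign, digits, and an optional decimal part. It will also pick up digits embedded in words (such as the 3 in 'abc3') and treat a hyphen directly before a digit as a minus sign, so tighten it if your text contains those.

```python
import re

# Optional '-', one or more digits, optionally followed by '.' and more digits
NUMBER_PATTERN = re.compile(r"-?\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return ints and floats found in text (hypothetical helper for this sketch)."""
    numbers = []
    for match in NUMBER_PATTERN.findall(text):
        # Keep whole numbers as int, anything with a decimal point as float
        numbers.append(float(match) if "." in match else int(match))
    return numbers


sample = "tellus 10 maximus -3 vel 2.5, lectus 15."
print(extract_numbers(sample))  # [10, -3, 2.5, 15]
```

Note that the trailing period in '15.' is not consumed, because the decimal part requires at least one digit after the dot.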