Finding the last data number recorded in the text using Python
Question:
I have a txt file of data that is recorded daily. The program runs every day, records the data it receives from the user, and assigns a number to each entry, like this:
#1
data number 1
data
data
data
-------------
#2
data number 2
text
text
-------------
#3
data number 3
-------------
My problem is with numbering the data. For example, when I run the program to record new data in the txt file, it should find the number of the last recorded entry, add one to it, and record my data under that number.
But I can’t write the code that finds the last data number.
I tried this:
Find "#" in the text, list all the numbers that follow a hashtag, and take the largest one, which should be the number of the last recorded entry.
text_file = open(r'test.txt', 'r')
line = text_file.read().splitlines()
for Number in line:
    hashtag = Number[Number.find('#')]
    if hashtag == '#':
        hashtag = Number[Number.find('#')+1]
        hashtag = int(hashtag)
        record_list.append(hashtag)
last_number = max(record_list)
But when I use hashtag = Number[Number.find('#')], even in the lines where there is no hashtag, it returns the first or last letters in that line as a hashtag.
And if the text file is empty, it gives the following error:
hashtag = Number[Number.find('#')]
~~~~~~^^^^^^^^^^^^^^^^^^
IndexError: string index out of range
How can I find the number of the last data and use it in saving the next data?
Answers:
Consider:
>>> s = "hello world"
>>> s[s.find('#')]
'd'
>>> s.find('#')
-1
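If # is not in the line, find returns -1, which, when used as an index, selects the last character of the line. On an empty line there is no last character, so the same index raises the IndexError shown in the question.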
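To illustrate the point above, here is a minimal sketch that checks the result of find before indexing. The function name number_after_hash is hypothetical and not part of the original code. It also reads every digit after the #, whereas Number[Number.find('#')+1] takes a single character and would turn #10 into 1.

```python
def number_after_hash(line):
    """Return the integer following '#' in line, or None if there is none.

    Hypothetical helper for illustration only.
    """
    pos = line.find('#')
    if pos == -1:
        # No '#' on this line: stop here instead of indexing with -1.
        return None
    digits = line[pos + 1:].strip()
    # Accept only plain decimal digits so int() cannot fail.
    return int(digits) if digits.isdecimal() else None


print(number_after_hash("#12"))            # 12
print(number_after_hash("data number 1"))  # None
print(number_after_hash(""))               # None
```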
We can use regular expressions and a list comprehension as one approach to solving this. Iterate over the lines, selecting only those which match the pattern of a numbered line. We’ll match the number part, converting that to an int. We select the last one, which should be the highest number.
import re

with open('test.txt', 'r') as text_file:
    next_number = [
        int(m.group(1))
        for x in text_file.read().splitlines()
        if (m := re.match(r'^\s*#(\d+)\s*$', x))
    ][-1] + 1
Or we can pass a generator expression to max to ensure we get the highest number.
import re

with open('test.txt', 'r') as text_file:
    next_number = max(
        int(m.group(1))
        for x in text_file.read().splitlines()
        if (m := re.match(r'^\s*#(\d+)\s*$', x))
    ) + 1
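Both versions raise an exception when the file contains no numbered line yet (IndexError for the list, ValueError for max). A possible way to cover that case, and the "use it in saving the next data" part of the question, is sketched below. The names next_record_number and append_record are hypothetical, and starting at 1 for an empty or missing file is an assumption. The walrus operator needs Python 3.8 or later.

```python
import re

# A header line such as "#3", optionally surrounded by whitespace.
RECORD_HEADER = re.compile(r'^\s*#(\d+)\s*$')


def next_record_number(lines):
    """Return one more than the highest record number, or 1 if none is found."""
    numbers = (
        int(m.group(1))
        for line in lines
        if (m := RECORD_HEADER.match(line))
    )
    # default=0 avoids ValueError when there are no numbered lines.
    return max(numbers, default=0) + 1


def append_record(path, body_lines):
    """Append a new numbered record to the file and return its number."""
    try:
        with open(path, 'r') as text_file:
            number = next_record_number(text_file)
    except FileNotFoundError:
        # Assumption: a missing file means no records exist yet.
        number = 1
    with open(path, 'a') as text_file:
        text_file.write(f'#{number}\n')
        for line in body_lines:
            text_file.write(line + '\n')
        text_file.write('-------------\n')
    return number
```

Calling append_record('test.txt', ['data number 4', 'data']) on the sample file from the question would append a record headed #4.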