Extracting 2-digit numbers from a string
Question:
I have a file that contains strings. From every string, I need to append every 2-digit number to my list. Here's the file content:
https://pastebin.com/N6gHRaVA
I need to iterate over every string and check whether the characters at index i and index i+1 are both digits. If they are, I append those two digits to the list and slice the string just past that 2-digit number.
For example, the string:
string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'
should work this way:
- Okay, I have found the number 74, so add 74 to my list and slice the string from just after 74 to the end.
- My string is now 69NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO. I have found the number 69, so add 69 to the list and slice the string again, and so on until no 2-digit number is left.
The problem is that I always get this error:
    if string[i].isdigit() and string[i+1].isdigit():
                               ~~~~~~^^^^^
IndexError: string index out of range
f = open("file.txt")
read = f.read().split()
f.close()
for string in read:
    l = list()
    i = 0
    print(string)
    while i<len(string):
        if string[i].isdigit() and string[i+1].isdigit():
            l.append(string[i] + string[i+1])
            string = string[i+2:]
            i = 0
        else:
            i+=1
My program stops at the string on line 31 of the file, which is:
'REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5'
I have no idea how to do this slice iteration, and please, don't use regex.
Answers:
Your loop condition is i < len(string), which lets i reach the last index of the string. At that point string[i+1] is one past the end, and that is what raises the IndexError. Try this:
while i < len(string) - 1:
EDITED:
Apparently, I didn't notice which string gave you the error. Because you check the element at i+1, when we start checking the last character, reaching for the next one raises an IndexError. So there should be a -1 in the condition.
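A small sketch of why only that one string fails, using the shortened stand-in 'ER5' (an assumption for brevity, not the full string from the file). Because `and` short-circuits, string[i+1] is only evaluated when string[i] is a digit, so the error appears only for a string whose last character is a digit:

```python
s = 'ER5'        # hypothetical short string that ends in a single digit
i = len(s) - 1   # the last index, which the original loop still visits

try:
    # s[i] is '5', so the right-hand side runs and reads one past the end.
    s[i].isdigit() and s[i + 1].isdigit()
    error = None
except IndexError as exc:
    error = str(exc)

print(error)

# With a trailing letter the right-hand side is never evaluated, so no error.
t = 'ERX'
safe = t[len(t) - 1].isdigit() and t[len(t)].isdigit()
print(safe)
```

This would explain why the earlier strings in the file passed without trouble.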
You’re going off the end of the string… Change:
while i<len(string):
to:
while i<len(string)-1:
And you should be fine.
If you were just looking at one character at a time, you could use your original while. The trick here is that you're always looking at a char and also "one ahead" of the char. So you have to shorten your check by one iteration to prevent going past the last char to check.
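To make the off-by-one concrete, here is a minimal sketch of the corrected loop wrapped in a function. The name `two_digit_numbers` is made up for illustration and does not appear in the question; the logic is the asker's loop with only the bound changed.

```python
def two_digit_numbers(string):
    """Collect non-overlapping pairs of adjacent digits, left to right."""
    found = []
    i = 0
    # Stop one short of the end so string[i + 1] is always a valid index.
    while i < len(string) - 1:
        if string[i].isdigit() and string[i + 1].isdigit():
            found.append(string[i] + string[i + 1])
            string = string[i + 2:]  # drop everything up to and including the pair
            i = 0
        else:
            i += 1
    return found


print(two_digit_numbers('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'))
print(two_digit_numbers('REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5'))
```

The second call uses the string that triggered the error; its lone trailing digit is now simply ignored.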
You could use recursion.
Here is what it would look like to deal with one of the strings.
Part of the code:
my_string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'
result_list = []

def read_string(s):
    # Scan for the first pair of adjacent digits.
    for i in range(1, len(s)):
        if s[i-1].isdigit() and s[i].isdigit():
            result = s[i-1] + s[i]
            result_list.append(result)
            # Recurse on whatever follows the pair, then stop scanning.
            read_string(s[i+1:])
            break
    return result_list

# Call the read_string function
x = read_string(my_string)
print(x)
OUTPUT:
['74', '69', '83', '84']
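One thing to watch with the version above is that it appends to a module-level list, so calling it on a second string keeps the results of the first. A possible variant, sketched here under the hypothetical name `read_pairs`, returns a fresh list on every call instead:

```python
def read_pairs(s):
    """Return non-overlapping two-digit pairs found in s, using recursion."""
    for i in range(1, len(s)):
        if s[i - 1].isdigit() and s[i].isdigit():
            # First pair found: keep it and recurse on what follows it.
            return [s[i - 1] + s[i]] + read_pairs(s[i + 1:])
    return []  # no pair left in this string


first = read_pairs('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO')
second = read_pairs('REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5')
print(first)
print(second)
```

Each pair found adds one level of recursion, so a very long run of digits could reach Python's recursion limit; for input like the example strings that is unlikely to matter.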
You are not stopping at the correct spot. You can just change your while loop to:
while i < len(string) - 1:
If I may suggest a slightly cleaner approach, see below.
f = open("file.txt")
read = f.read().split()
f.close()
for string in read:
    l = list()
    i = 0
    print(string)
    while i < len(string) - 1:
        numCheck = i + 1 # You call it more than once. Set to var
        ltr = string[i] + string[numCheck] # no need to call this multiple times, just set to a var
        if ltr.isdigit():
            l.append(ltr)
            string = string[numCheck + 1:] # slice past both digits so pairs do not overlap
            i = 0
        else:
            i += 1
    print(l)
I changed your while loop as shown above and put the calls you make more than once into variables. Also, since your list is initialized inside the for loop, you only keep the numbers from your last string. If you want one list with all the numbers, simply move it out, like so:
f = open("file.txt")
read = f.read().split()
f.close()
l = list()
for string in read:
    i = 0
    print(string)
    while i < len(string) - 1:
        numCheck = i + 1 # You call it more than once. Set to var
        ltr = string[i] + string[numCheck] # no need to call this multiple times, just set to a var
        if ltr.isdigit():
            l.append(ltr)
            string = string[numCheck + 1:] # slice past both digits so pairs do not overlap
            i = 0
        else:
            i += 1
print(l)
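The same idea can be sketched with a with-block, which closes the file automatically, and with an index that jumps past each pair instead of re-slicing the string. The function name `collect_pairs` is an assumption for illustration; moving the index forward by 2 is meant to give the same result as slicing and resetting i to 0.

```python
def collect_pairs(path):
    """Read whitespace-separated strings from path and gather all digit pairs."""
    pairs = []
    # The with-block closes the file even if an exception is raised.
    with open(path) as f:
        for string in f.read().split():
            i = 0
            while i < len(string) - 1:
                pair = string[i:i + 2]
                if pair.isdigit():
                    pairs.append(pair)
                    i += 2  # skip past the pair instead of re-slicing the string
                else:
                    i += 1
    return pairs
```

With the file from the question, this would be called as collect_pairs("file.txt").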