Extracting 2-digit numbers from a string

Question:

I have a file that contains strings. From every string I need to append every 2-digit number to my list. Here’s the file content:
https://pastebin.com/N6gHRaVA

I need to iterate over every string and check whether the characters at index i and index i+1 are both digits. If they are, I append those two digits to the list and slice the string so that it starts right after that 2-digit number.

for example the string:

string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'

should work in this way:

  1. I find the number 74, add 74 to my list, and slice the string from just after 74 to the end.
  2. My string is now 69NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO. I find the number 69, add 69 to the list, and slice again, repeating until there are no more 2-digit numbers.

The problem is that I always get this error:
        if string[i].isdigit() and string[i+1].isdigit():
                               ~~~~~~^^^^^
IndexError: string index out of range

My code:

f = open("file.txt")
read = f.read().split()
f.close()
for string in read:
    l = list()
    i = 0
    print(string)
    while i<len(string):
        if string[i].isdigit() and string[i+1].isdigit():
            l.append(string[i] + string[i+1])
            string = string[i+2:]
            i = 0
        else:
            i+=1

My program stops at the string on line 31 of the file, which is:
'REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5'

I have no idea how to do this slice iteration, and please, don’t use regex.

Asked By: archerwell32

Answers:

Your loop condition is i < len(string), which lets i reach the last index of the string. At that point string[i+1] does not exist, so the lookup raises an IndexError. Stop one character earlier instead. Try this:

while i < len(string) - 1:

EDITED:
Apparently, I didn’t notice which string gave you the error. Because you check the element at index i+1, when the loop starts checking the last character, reaching for the next one raises an IndexError. So there should be a -1 in the condition.
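
To make the off-by-one concrete, here is a minimal reproduction. It is only a sketch: the short string is a made-up stand-in for the tail of the failing line, and the variable names are illustrative. Note that `and` short-circuits, so string[i+1] is only evaluated when string[i] is a digit, which is why only strings ending in a digit trigger the error.

```python
# Minimal reproduction: the last character has no neighbour to its right.
s = "ODER5"            # stand-in for the tail of the failing string
last = len(s) - 1      # index of the final character, here 4

try:
    s[last + 1]        # same lookup as string[i+1] when i is the last index
    overflowed = False
except IndexError:
    overflowed = True

# With the bound len(s) - 1, i stops at last - 1, so i + 1 is always valid.
safe_indices = list(range(len(s) - 1))
print(overflowed, safe_indices)
```

Every index in safe_indices still has a character after it, which is exactly what the -1 in the condition guarantees.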

Answered By: Ni3dzwi3dz

You’re going off the end of the string… Change:

 while i<len(string):

to:

 while i<len(string)-1:

And you should be fine.

If you were just looking at one character at a time, you could use your original while. The trick here is that you’re always looking at a char and also "one ahead" of the char. So you have to shorten your check by one iteration to prevent going past the last char to check.
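
For reference, here is a sketch of the original loop with only that bound changed, wrapped in a function so it can be tried on single strings. The function name two_digit_numbers is my own and not from the question.

```python
def two_digit_numbers(string):
    """Collect non-overlapping two-digit chunks, scanning left to right."""
    found = []
    i = 0
    # Stop one short of the end so string[i + 1] always exists.
    while i < len(string) - 1:
        if string[i].isdigit() and string[i + 1].isdigit():
            found.append(string[i] + string[i + 1])
            string = string[i + 2:]   # drop everything up to and including the pair
            i = 0
        else:
            i += 1
    return found


print(two_digit_numbers("7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO"))
print(two_digit_numbers("REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5"))
```

The second call is the string that used to crash. Its only digit is the trailing 5, so no pair is found and the function returns an empty list instead of raising.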

Answered By: GaryMBloom

You could use recursion.
Here is what it would look like to deal with one of the strings.

Part of the code:

my_string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'
result_list = []  # shared list that every recursive call appends to

def read_string(s):
    for i in range(1, len(s)):
        # Compare the current character with the one before it
        if s[i-1].isdigit() and s[i].isdigit():
            result_list.append(s[i-1] + s[i])
            # Recurse on the text after the pair, then stop this scan
            read_string(s[i+1:])
            break

    return result_list

# Call the read_string function
x = read_string(my_string)
print(x)

OUTPUT:

['74', '69', '83', '84']
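
If the shared result_list gets in the way (it keeps growing across calls), one possible variant builds the list from the return values instead. This is only a sketch, and read_pairs is a name I made up for it.

```python
def read_pairs(s):
    """Return the two-digit chunks of s, found recursively."""
    for i in range(len(s) - 1):
        if s[i].isdigit() and s[i + 1].isdigit():
            # Keep this pair, then recurse on whatever follows it.
            return [s[i:i + 2]] + read_pairs(s[i + 2:])
    return []  # base case: no pair left in s


print(read_pairs('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'))
```

Each pair found adds one level of recursion, so a string with a very large number of pairs could hit Python's recursion limit. For lines like the sample above that should not be a concern.
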
Answered By: ScottC

You are not stopping at the correct spot. You can just change your while loop to:

while i < len(string) - 1:

If I may suggest a slightly cleaner approach, see below.

f = open("file.txt")
read = f.read().split()
f.close()
for string in read:
    l = list()
    i = 0
    print(string)
    while i < len(string) - 1:
        numCheck = i + 1 # index of the next character; used more than once, so store it
        ltr = string[i] + string[numCheck] # the two-character candidate, built once
        if ltr.isdigit():
            l.append(ltr)
            string = string[numCheck + 1:] # continue after both digits
            i = 0
        else:
            i += 1
        
print(l)

I changed your while loop as shown above and stored the values you use more than once in variables. Also, since your list is initialized inside the for loop, you only keep the numbers from your last string. If you want a list with all the numbers, simply move it out of the loop, like so:

f = open("file.txt")
read = f.read().split()
f.close()
l = list()
for string in read:
    i = 0
    print(string)
    while i < len(string) - 1:
        numCheck = i + 1 # index of the next character; used more than once, so store it
        ltr = string[i] + string[numCheck] # the two-character candidate, built once
        if ltr.isdigit():
            l.append(ltr)
            string = string[numCheck + 1:] # continue after both digits
            i = 0
        else:
            i += 1
        
print(l)
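
Going one step further, the re-slicing can be dropped entirely: jumping the index forward by two after a match visits the same characters as slicing and restarting at 0. The sketch below assumes the strings are already in a list. The name collect_pairs and the sample data are hypothetical and not taken from the pastebin file.

```python
def collect_pairs(strings):
    """Gather two-digit chunks from every string into one list."""
    pairs = []
    for s in strings:
        i = 0
        while i < len(s) - 1:
            chunk = s[i:i + 2]      # current character plus the next one
            if chunk.isdigit():
                pairs.append(chunk)
                i += 2              # jump past the pair instead of re-slicing
            else:
                i += 1
    return pairs


# Made-up sample lines for illustration
sample = ["7469NMPLWX8384RXXO", "REDOHGMDER5", "AB12C345"]
print(collect_pairs(sample))
```

With the real file, the input would be the result of f.read().split(), ideally read inside a with open("file.txt") as f: block so the file is closed automatically.
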
Answered By: Michael Gathara