How to find index positions of a substring using Python

Question:

Very new to Python here, and struggling. Any help is appreciated! Confession: this is obviously a request for help with homework, but my course ends tomorrow and the instructor takes too long to return a message, so I’m afraid if I wait I won’t get this finished in time.

I’m using a learning module from Cornell University called introcs. It’s documented here: http://cs1110.cs.cornell.edu/docs/index.html

I am trying to write a function that returns a tuple of all indexes of a substring within a string. I feel like I’m pretty close, but just not quite getting it. Here’s my code:


import introcs 

def findall(text,sub):
    result = ()
    x = 0
    pos = introcs.find_str(text,sub,x)

    for i in range(len(text)):
        if introcs.find_str(text,sub,x) != -1:
            result = result + (introcs.find_str(text,sub,x), )
            x = x + 1 + introcs.find_str(text,sub,x)

    return result

On the call findall('how now brown cow', 'ow') I want it to return (1, 5, 10, 15), but it lops off the last result and returns (1, 5, 10).

Any pointers would be really appreciated!

Asked By: garet


Answers:

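You can use re to do it. Escape the substring with re.escape() so that characters like . or * are matched literally:

import re

found = [m.start() for m in re.finditer(re.escape(substring), string)]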
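
A self-contained sketch of the regex approach, for anyone who wants to run it. The function names are made up for illustration. The second variant uses a zero-width lookahead, one common way to also report overlapping matches, which plain finditer skips.

```python
import re


def findall_re(text, sub):
    """Start indexes of non-overlapping occurrences of the literal string sub."""
    # re.escape keeps characters such as "." or "*" from acting as regex syntax.
    return [m.start() for m in re.finditer(re.escape(sub), text)]


def findall_re_overlapping(text, sub):
    """Start indexes of all occurrences, including overlapping ones."""
    # The lookahead matches at a position without consuming characters,
    # so the scan advances one character at a time.
    pattern = "(?=" + re.escape(sub) + ")"
    return [m.start() for m in re.finditer(pattern, text)]


print(findall_re("how now brown cow", "ow"))       # [1, 5, 10, 15]
print(findall_re("abababa", "aba"))                # [0, 4]
print(findall_re_overlapping("abababa", "aba"))    # [0, 2, 4]
```

Both return lists; wrap the result in tuple() if a tuple is required.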
Answered By: walker

You don’t need to loop over all the characters in text. Just keep calling introcs.find_str() until it can’t find the substring and returns -1.

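Your calculation of the new value of x is wrong. find_str returns an index into the whole string, not an offset from x, so adding x to it overshoots. The new x should just be 1 more than the index of the previous match.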

Make result a list rather than a tuple so you can use append() to add to it. If you really need to return a tuple you can use return tuple(result) at the end to convert it.

import introcs

def findall(text,sub):
    result = []
    x = 0
    while True:
        # search again, starting just past the previous match
        pos = introcs.find_str(text,sub,x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1

    return result
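
To see why the question's version drops the last match, it can help to trace the start offset. This is only a sketch: it assumes introcs.find_str behaves like the built-in str.find (not checked against the introcs documentation) and uses str.find as a stand-in. trace_original is a hypothetical helper that replays the original update x = x + 1 + pos.

```python
def trace_original(text, sub):
    """Replay the question's loop, recording (start offset, match index) pairs."""
    steps = []
    x = 0
    for _ in range(len(text)):
        pos = text.find(sub, x)   # stand-in for introcs.find_str(text, sub, x)
        if pos == -1:
            break                 # x never changes again, so later passes find nothing
        steps.append((x, pos))
        x = x + 1 + pos           # the bug: pos is already an absolute index
    return steps


for start, pos in trace_original("how now brown cow", "ow"):
    print("search from", start, "-> found at", pos)
# search from 0 -> found at 1
# search from 2 -> found at 5
# search from 8 -> found at 10
```

After the third match x becomes 19, which is past the end of the 17-character string, so the match at index 15 is never seen. With x = pos + 1 the searches would start at 0, 2, 6, 11 and 16 instead.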
Answered By: Barmar

Your code shows evidence of three separate attempts at keeping track of where you are in the string:

  1. you loop over it with i
  2. you put the position a sub was found at in pos
  3. you compute an x

The question here is what do you want to happen in this case:

findall('abababa', 'aba')

Do you expect [0, 4] or [0, 2, 4] as a result? Assuming find_str works just like the standard str.find() and you want the [0, 2, 4] result, you can begin at the start of the string and start each subsequent search one position after the previous match. Also, instead of adding tuples together, why not build a list:

# this replaces your import, since we don't have access to it
class introcs:
    @staticmethod
    def find_str(text, sub, x):
        # assuming find_str is the same as str.find()
        return text.find(sub, x)


def findall(text,sub):
    result = []
    pos = -1

    while True:
        pos = introcs.find_str(text, sub, pos + 1)
        if pos == -1:
            break
        result.append(pos)

    return result


print(findall('abababa', 'aba'))

Output:

[0, 2, 4]

If you only want to match each character once, this works instead:

def findall(text,sub):
    result = []
    pos = -len(sub)

    while True:
        pos = introcs.find_str(text, sub, pos + len(sub))
        if pos == -1:
            break
        result.append(pos)

    return result

Output:

[0, 4]
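
If it helps, the two variants can be folded into one function. This is a rough sketch rather than anything from introcs: the overlap flag is a made-up parameter, str.find again stands in for introcs.find_str, and the result is converted to a tuple because that is what the question asked for.

```python
def findall(text, sub, overlap=True):
    """Return a tuple of the indexes where sub occurs in text."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # Advance by one character to allow overlapping matches,
    # or by the whole substring to skip past each match.
    step = 1 if overlap else len(sub)
    result = []
    pos = text.find(sub)          # stand-in for introcs.find_str(text, sub, 0)
    while pos != -1:
        result.append(pos)
        pos = text.find(sub, pos + step)
    return tuple(result)


print(findall("how now brown cow", "ow"))          # (1, 5, 10, 15)
print(findall("abababa", "aba"))                   # (0, 2, 4)
print(findall("abababa", "aba", overlap=False))    # (0, 4)
```

The empty-substring check is there because str.find("", x) succeeds at every position, which would make the non-overlapping variant loop forever.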
Answered By: Grismar