Splitting a string by list of indices

Question:

I want to split a string by a list of indices, where the split segments begin with one index and end before the next one.

Example:

s = 'long string that I want to split up'
indices = [0,5,12,17]
parts = [s[index:] for index in indices]
for part in parts:
    print part

This will return:

long string that I want to split up
string that I want to split up
that I want to split up
I want to split up

I’m trying to get:

long
string
that
I want to split up

Asked By: Yarin

||

Answers:

s = 'long string that I want to split up'
indices = [0,5,12,17]
parts = [s[i:j] for i,j in zip(indices, indices[1:]+[None])]

returns

['long ', 'string ', 'that ', 'I want to split up']

which you can print using:

print 'n'.join(parts)

Another possibility (without copying indices) would be:

s = 'long string that I want to split up'
indices = [0,5,12,17]
indices.append(None)
parts = [s[indices[i]:indices[i+1]] for i in xrange(len(indices)-1)]
Answered By: eumiro

Here is a short solution with heavy usage of the itertools module. The tee function is used to iterate pairwise over the indices. See the Recipe section in the module for more help.

>>> from itertools import tee, izip_longest
>>> s = 'long string that I want to split up'
>>> indices = [0,5,12,17]
>>> start, end = tee(indices)
>>> next(end)
0
>>> [s[i:j] for i,j in izip_longest(start, end)]
['long ', 'string ', 'that ', 'I want to split up']

Edit: This is a version that does not copy the indices list, so it should be faster.

Answered By: schlamar

You can write a generator if you don’t want to make any modifications to the list of indices:

>>> def split_by_idx(S, list_of_indices):
...     left, right = 0, list_of_indices[0]
...     yield S[left:right]
...     left = right
...     for right in list_of_indices[1:]:
...         yield S[left:right]
...         left = right
...     yield S[left:]
... 
>>> 
>>> 
>>> s = 'long string that I want to split up'
>>> indices = [5,12,17]
>>> [i for i in split_by_idx(s, indices)]
['long ', 'string ', 'that ', 'I want to split up']
Answered By: Zhou Shao

Another solution (a bit more readable):

parts=[]; i2=len(s)  #--> i1 and i2 are 'startIndex' and 'endIndex'

for i1 in reversed(indices): parts.append( s[i1:i2] );  i2=i1

parts.reverse()

This reverses the indices and therefore starts splitting from the last index position to the ‘endIndex’ i2 (which is updated in every loop).

Of course the elements are in the wrong order than. That’s why I reversed the result array at the end.

I think for beginners this is a bit more readable than the accepted answer.

Answered By: Manny_user3297459
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.