Python: how to split a list into an unknown number of smaller lists based on a delimiter

Question:

I’ve got a list which contains the following strings:

MainList
'00:00'
'00:01'
'00:02'
'00:03'
'00:04'
'00:00'
'00:01'
'00:02'
'00:03'
'00:04'

I would like to split this into a number of smaller lists, starting a new one whenever '00:00' is encountered, since '00:00' is the only element that won’t change:

Desired output:
List1
'00:00'
'00:01'
'00:02'
'00:03'
'00:04'

List2
'00:00'
'00:01'
'00:02'
'00:03'
'00:04'

I tried looking at list slicing, but the problem is that the last value, and therefore the number of elements in each group, may change. Moreover, I’m not sure how many smaller lists I’ll need, or how I’d dynamically create n smaller lists.

Asked By: cbros2008


Answers:

Explicitly, you could do it like this:

sep = '00:00'
split_list = []
for item in MainList:
    if item == sep:
        split_list.append([item])
    else:
        split_list[-1].append(item)

print(split_list)
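
The loop above relies on the list starting with the separator; otherwise `split_list[-1]` raises an `IndexError` on the first item. Below is a minimal sketch of a guarded variant for Python 3. The function name `split_on` and the decision to keep any leading items in a group of their own are assumptions for illustration, not part of the original answer.

```python
def split_on(items, sep='00:00'):
    """Start a new sublist at every occurrence of sep."""
    groups = []
    for item in items:
        # Open a new group on a separator, or when no group exists yet
        # (this covers input that does not begin with the separator).
        if item == sep or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups


main_list = ['00:00', '00:01', '00:02', '00:00', '00:01']
print(split_on(main_list))
# [['00:00', '00:01', '00:02'], ['00:00', '00:01']]
```

Returning a list of lists also answers the "n smaller lists" part of the question: there is no need to create separately named variables, since each group is reachable by index.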
Answered By: Cédric Julien

I usually do this:

def splitby( lst, breaker='00:00'):
    current = []
    it = iter(lst)
    first = next(it)
    assert first==breaker, "`lst` must begin with `breaker`"
    current.append(first)
    for item in it:
        if item == breaker:
            yield current
            current = []
        current.append(item)
    yield current
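
Since `splitby` is a generator, it produces nothing until it is iterated. The sketch below restates it for Python 3 with one assumed change, an early return on empty input, so that the snippet runs on its own; the sample data is made up for illustration.

```python
def splitby(lst, breaker='00:00'):
    it = iter(lst)
    try:
        first = next(it)
    except StopIteration:
        return  # assumed behaviour: empty input yields no groups
    assert first == breaker, "`lst` must begin with `breaker`"
    current = [first]
    for item in it:
        if item == breaker:
            # A new breaker closes the group collected so far.
            yield current
            current = []
        current.append(item)
    yield current


sample = ['00:00', '00:01', '00:02', '00:00', '00:01']
for chunk in splitby(sample):
    print(chunk)
# ['00:00', '00:01', '00:02']
# ['00:00', '00:01']

chunks = list(splitby(sample))  # materialise all groups at once
```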

The inevitable itertools solution is a bit more general:

from itertools import groupby

class splitter(object):
    
    def __init__(self, breaker):
        self.breaker = breaker
        self.current_group = 0
        
    def __call__(self, item):
        if item == self.breaker:
            self.current_group+=1
        return self.current_group
        
    def group(self, items):
        return (list(v) for k,v in groupby(items,self))
    
# `items` is the list to split
print(list(splitter('00:00').group(items)))
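
The same grouping idea can be sketched without a stateful callable by precomputing a running count of breakers with `itertools.accumulate` (Python 3). This is an alternative technique rather than the answer's own method, and the name `split_groups` is hypothetical.

```python
from itertools import accumulate, groupby


def split_groups(items, breaker='00:00'):
    items = list(items)  # the input is traversed twice below
    # Running count of breakers seen so far: every item gets the index
    # of the group it belongs to.
    keys = accumulate(int(item == breaker) for item in items)
    # Pair each item with its group index, then group on that index.
    paired = zip(keys, items)
    return [[item for _, item in group]
            for _, group in groupby(paired, key=lambda pair: pair[0])]


print(split_groups(['00:00', '00:01', '00:00', '00:01', '00:02']))
# [['00:00', '00:01'], ['00:00', '00:01', '00:02']]
```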
Answered By: Jochen Ritzel

Comprehensions are your best friend :). Just two lines:

>>> a=['00:00', '00:01', '00:02', '00:03', '00:00', '00:01', '00:02']
>>> found=[index for index,item in enumerate(a) if item=='00:00'] + [len(a)]
>>> [a[found[i]:found[i+1]] for i in range(len(found)-1)]
[['00:00', '00:01', '00:02', '00:03'], ['00:00', '00:01', '00:02']]

Here is what we do:

We search for delimiter positions and get a list which contains delimiter indexes:

>>> found=[index for index,item in enumerate(a) if item=='00:00']
>>> found
[0, 4]

We append len(a) to that list so that the last sublist is included.

Then we build the new lists by slicing a between consecutive found indexes:

>>> [a[found[i]:found[i+1]] for i in range(len(found)-1)]
[['00:00', '00:01', '00:02', '00:03'], ['00:00', '00:01', '00:02']]
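
The index arithmetic in the slicing step can also be written by pairing each boundary with the next one using `zip`. This is a small sketch of the same two-step idea; the variable names `bounds` and `chunks` are chosen here for illustration. As with the version above, any items that come before the first delimiter are dropped.

```python
a = ['00:00', '00:01', '00:02', '00:03', '00:00', '00:01', '00:02']

# Delimiter positions, plus len(a) as the end of the last chunk.
bounds = [i for i, item in enumerate(a) if item == '00:00'] + [len(a)]

# Consecutive boundaries (start, end) delimit each chunk.
chunks = [a[start:end] for start, end in zip(bounds, bounds[1:])]
print(chunks)
# [['00:00', '00:01', '00:02', '00:03'], ['00:00', '00:01', '00:02']]
```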
Answered By: utdemir

I could think of another way 🙂

def list_split(a):
    #a=['00:00', '00:01', '00:02', '00:03', '00:00', '00:01', '00:02']
    output = []
    count = 0

    if len(a) < 1:
        output.append(a)
        return output

    for i, item in enumerate(a[1:]):
        if item == a[0]:
            output.append(a[count:i+1])
            count = i + 1
    # The loop never breaks, so the for/else is unnecessary:
    # append the final chunk and return.
    output.append(a[count:])
    return output
Answered By: thiruvenkadam