Grouping elements into a list

Question:

I want to group the elements of data into a list of lists based on an indexer.
Starting at the first position in data, go up to and including the next False. That’s a grouping. Continue until the last element.

data = ['a','b','c','d','e','f'] 
indexer = [True, True, False, False, True, True]

Outcome would be:

[['a','b','c'], ['d'], ['e','f'] ]

Is itertools groupby the right solution? I’m a little confused about how to implement it.

Asked By: Peter

Answers:

Use accumulate, then groupby:

from itertools import groupby, accumulate
from operator import itemgetter

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]


groups = accumulate((not b for b in indexer), initial=0)
res = [[v for _, v in vs] for k, vs in groupby(zip(groups, data), key=itemgetter(0))]
print(res)

Output

[['a', 'b', 'c'], ['d'], ['e', 'f']]

In your particular example the variable groups is equivalent to:

[0, 0, 0, 1, 2, 2, 2]  # print(list(groups))

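The idea is to change the group id every time you encounter a False value, hence the need to negate it. Passing initial=0 (available since Python 3.8) delays the change by one position, so the False element itself stays in the group it closes.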
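To make the group ids concrete, here is a small sketch that pairs each element of data with its id. It assumes Python 3.8+ for the initial argument; the names ids and pairs are only illustrative and not part of the answer above.

```python
from itertools import accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# "not b" is True (counted as 1) exactly where the indexer is False, so the
# running sum goes up right after each False. initial=0 shifts the ids by one
# position, which keeps the False element in the group it closes.
ids = list(accumulate((not b for b in indexer), initial=0))

# zip stops at the shorter input, so the extra trailing id is ignored.
pairs = list(zip(ids, data))
print(pairs)
# [(0, 'a'), (0, 'b'), (0, 'c'), (1, 'd'), (2, 'e'), (2, 'f')]
```

Elements that share an id end up in the same group once groupby is applied.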

As an alternative, you could use a variation on @Matiiss's idea (all credit to him):

res, end = [], True
for d, i in zip(data, indexer):
    if end:
        res.append([])
    res[-1].append(d)
    end = not i

print(res)

Note: In Python you can sum booleans directly because bool is a subclass of int (True == 1 and False == 0). This is why accumulate works on the negated indexer.
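A quick check of that note, as a minimal sketch (the names flags and n_false are illustrative only):

```python
flags = [True, True, False, False, True, True]

# bool is a subclass of int, so booleans take part in arithmetic directly.
print(isinstance(True, int))  # True
print(True + True)            # 2

# Summing the negated flags counts the False entries.
n_false = sum(not f for f in flags)
print(n_false)                # 2
```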

Answered By: Dani Mesejo

You can simply append values to a temporary list. When you reach a False, append the temporary list to the result and start a new one, so a new list begins after each False. Finally, append the last temporary list to the result if it is not empty:

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

result, temp = [], []
for value, index in zip(data, indexer):
    temp.append(value)
    if not index:
        result.append(temp)
        temp = []
if temp:
    result.append(temp)

print(result)
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
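The final "if temp" check matters when the indexer ends with False or the input is empty. The sketch below wraps the same loop in a function to show those cases; the name group_until_false is hypothetical and not part of the original answer.

```python
def group_until_false(data, indexer):
    """Split data into groups, closing a group after each False flag."""
    result, temp = [], []
    for value, flag in zip(data, indexer):
        temp.append(value)
        if not flag:
            result.append(temp)
            temp = []
    # Only add the trailing group if something is left in it.
    if temp:
        result.append(temp)
    return result


print(group_until_false(['a', 'b', 'c', 'd', 'e', 'f'],
                        [True, True, False, False, True, True]))
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
print(group_until_false(['a', 'b'], [True, False]))
# [['a', 'b']]  (no empty trailing group)
print(group_until_false([], []))
# []
```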
Answered By: Matiiss

Variation of Dani’s without itemgetter: group the bare group numbers first, then zip each group with the data iterator:

from itertools import groupby, accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

data = iter(data)
groups = accumulate((not b for b in indexer), initial=0)
res = [[d for _, d in zip(vs, data)] for _, vs in groupby(groups)]
print(res)
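This variation depends on data being one shared iterator and on the argument order in zip(vs, data). zip asks its first argument for an item first and stops as soon as that argument is exhausted, so no extra element is pulled from data. A small sketch of that behaviour, with illustrative names only:

```python
letters = iter('abcdef')

# The short argument comes first: zip stops when it runs out, without
# taking anything extra from the shared iterator.
first = [ch for _, ch in zip([0, 0, 0], letters)]
second = [ch for _, ch in zip([1], letters)]
rest = list(letters)
print(first, second, rest)
# ['a', 'b', 'c'] ['d'] ['e', 'f']

# With the arguments swapped, zip takes one more element from the iterator
# before it notices the other argument is exhausted, and that element is lost.
letters = iter('abcdef')
swapped = [ch for ch, _ in zip(letters, [0, 0, 0])]
leftover = list(letters)
print(swapped, leftover)
# ['a', 'b', 'c'] ['e', 'f']
```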

Two more ways, using the idea of shifting the indexer so that we split before each False (both need Python 3.8+ for the := operator):

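# Loop version: start a new group whenever the shifted flag is False.
res = []
for d, i in zip(data, [False] + indexer):
    if not i:
        res.append(r := [])
    r.append(d)

# Comprehension version: keep a new list [d] when the shifted flag is False;
# otherwise append d to the current list (append returns None, so nothing is kept).
res = [
    r := [d]
    for d, i in zip(data, [False] + indexer)
    if not i or r.append(d)
]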
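To see what the shift does, the following sketch marks which elements start a new group. The names shifted and starts_group are illustrative and do not appear in the answer.

```python
data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# Prepending False pairs every element with the flag of the element before it
# (and the first element with False, so it always opens a group).
shifted = [False] + indexer

# A False in the shifted position means the previous group just ended.
starts_group = [not flag for _, flag in zip(data, shifted)]
print(starts_group)
# [True, False, False, True, True, False]
```

Here 'a', 'd' and 'e' each open a group, which matches the expected result.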
Answered By: Kelly Bundy

A simple approach is to iterate over indexer using enumerate():

groups, start = [], 0
for end, boolVal in enumerate(indexer):
    if not boolVal:
        groups.append(data[start:end + 1])
        start = end + 1
if start < len(data):
    groups.append(data[start:])

Output:

[['a', 'b', 'c'], ['d'], ['e', 'f']]
Answered By: constantstranger