Python itertools.groupby gives unexpected results when sorting by binary representation

Question:

I am trying to solve a simple problem using itertools.groupby: group the numbers from 0 to 7 according to the number of 1‘s in their binary representation. So I want to produce the mapping

{0: [0], 1: [1, 2, 4], 2: [3, 5, 6], 3: [7]}

But here is what I get from groupby:

>>> from itertools import groupby
>>> def key(i):
...     print(i, bin(i), bin(i).count('1'))
...     return bin(i).count('1')
>>> groups = {k: list(v) for k, v in groupby(range(8), key=key)}
0 0b0 0
1 0b1 1
2 0b10 1
3 0b11 2
4 0b100 1
5 0b101 2
6 0b110 2
7 0b111 3
>>> groups
{0: [0], 1: [4], 2: [5, 6], 3: [7]}

The result has me absolutely baffled. The print statements show that the individual calls to the key function behave as expected, and yet I loose the numbers 1, 2, 3 along the way.
It get’s even worse when I use e.g. 16:

>>> {k: list(v) for k, v in groupby(range(16), key=lambda i: bin(i).count('1'))}
{0: [0], 1: [8], 2: [12], 3: [13, 14], 4: [15]}

I am hoping to understand how groupby arrives at this result, and to learn if their is a way to solve this using itertools. (I am not looking for a solution to the problem as such, only for a fancy generator solution using e.g. itertools.)

(I’ve tried this in python 3.9 and 3.10 so I’m fairly certain it is not a bug)

Asked By: tBuLi

||

Answers:

If you want to use groupby you need to sort input list first.

groups = {k: list(v) for k, v in groupby(sorted(range(8), key=key), key=key)}

Your generator discards old entries when same group is encountered later.

You are already using dict so you don’t need to use groupby at all

d = defaultdict(list)
for i in range(8):
    d[key(i)].append(i)
Answered By: ekim boran