How to get all possible combinations of dividing a list of length n into m sublists

Question:

In my specific case I would like to get all possible combinations of dividing a list of length 20 into 4 sublists. The sublists may have any length from 1 to 17. As a result I would like to have a list of lists of sublists with all possible combinations.

List to start with:

list_init = [a, b,  c,  d,  e,  f,  g,  h,  i,  j,  k,  l,  m,  n,  o,  p,  q,  r,  s,  t]

List of lists of lists with all combinations:

list_combs = [
[[a],   [b],    [c],    [d, e,  f,  g,  h,  i,  j,  k,  l,  m,  n,  o,  p,  q,  r,  s,  t]],
[[a,d], [b],    [c],    [e, f,  g,  h,  i,  j,  k,  l,  m,  n,  o,  p,  q,  r,  s,  t]],
[[a,d,e],   [b],    [c],    [f, g,  h,  i,  j,  k,  l,  m,  n,  o,  p,  q,  r,  s,  t]],
.
.
.
]
Asked By: volfi

||

Answers:

You can do that recursively by combining a first part of length 1,2,3,… with the partitions in 3 of the rest (recursing to 2 and 1):

from itertools import combinations
def partCombo(L,N=4):
    if N==1: yield [L]; return
    for size in range(1,len(L)-N+2):
        for combo in combinations(range(len(L)),size): # index combinations
            part      = list(L[i] for i in combo)      # first part
            remaining = list(L)
            for i in reversed(combo): del remaining[i] # unused items
            yield from ([part]+rest for rest in partCombo(remaining,N-1))

output:

aList = list("abcdefg")
for part in partCombo(aList):
    print(part)

[['a'], ['b'], ['c'], ['d', 'e', 'f', 'g']]
[['a'], ['b'], ['d'], ['c', 'e', 'f', 'g']]
[['a'], ['b'], ['e'], ['c', 'd', 'f', 'g']]
[['a'], ['b'], ['f'], ['c', 'd', 'e', 'g']]
[['a'], ['b'], ['g'], ['c', 'd', 'e', 'f']]
[['a'], ['b'], ['c', 'd'], ['e', 'f', 'g']]
[['a'], ['b'], ['c', 'e'], ['d', 'f', 'g']]
[['a'], ['b'], ['c', 'f'], ['d', 'e', 'g']]
[['a'], ['b'], ['c', 'g'], ['d', 'e', 'f']]
[['a'], ['b'], ['d', 'e'], ['c', 'f', 'g']]
[['a'], ['b'], ['d', 'f'], ['c', 'e', 'g']]
[['a'], ['b'], ['d', 'g'], ['c', 'e', 'f']]
[['a'], ['b'], ['e', 'f'], ['c', 'd', 'g']]
[['a'], ['b'], ['e', 'g'], ['c', 'd', 'f']]
[['a'], ['b'], ['f', 'g'], ['c', 'd', 'e']]
[['a'], ['b'], ['c', 'd', 'e'], ['f', 'g']]
... and many more ... (total 8400)

For a list of 20 items, there will be 1,085,570,781,624 combinations.

from math import factorial
def countCombos(L,N=4):
    if N==1: return 1
    result = 0
    for size in range(1,L-N+2):
        c = factorial(L)//factorial(size)//factorial(L-size)
        c *= countCombos(L-size,N-1)
        result += c
    return result

sum(1 for _ in partCombo("abcdefg"))    # 8400
sum(1 for _ in partCombo("abcdefghij")) # 818520

countCombos(7)  # 8400 
countCombos(10) # 818520
countCombos(15) # 1016542800
countCombos(20) # 1085570781624

With that many combinations (of 20 items), it will be impossible to output a list of of all the combinations (it would never fit in memory). This is why the function is written as a generator. Still, it would take a long time (weeks) to go through all the combinations produced by the generator function on a list of 20 items.

Answered By: Alain T.

This is equivalent to deciding which list each element should go into (for example, by picking a number from 0 to 3, 20 times), building each list of lists, and rejecting the ones which contain empty lists.

So, we can write a generator like:

from itertools import product

def partitions_of(elements, list_count):
    element_count = len(elements)
    for list_ids in product(range(list_count), repeat=element_count):
        if len(set(list_ids)) != list_count:
            continue # this combination would have an empty list.
        result = [[] for _ in range(list_count)]
        for list_id, element in zip(list_ids, elements):
            result[list_id].append(element)
        yield result

Testing it:

>>> for p in partitions_of('abcd', 2):
...     print(p)
... 
[['a', 'b', 'c'], ['d']]
[['a', 'b', 'd'], ['c']]
[['a', 'b'], ['c', 'd']]
[['a', 'c', 'd'], ['b']]
[['a', 'c'], ['b', 'd']]
[['a', 'd'], ['b', 'c']]
[['a'], ['b', 'c', 'd']]
[['b', 'c', 'd'], ['a']]
[['b', 'c'], ['a', 'd']]
[['b', 'd'], ['a', 'c']]
[['b'], ['a', 'c', 'd']]
[['c', 'd'], ['a', 'b']]
[['c'], ['a', 'b', 'd']]
[['d'], ['a', 'b', 'c']]

As long as the number of input elements is large compared to the number of lists, relatively few candidates should get rejected. In the worst case, where there are as many lists as elements, such that we effectively permute the list and put each element in its own list, this will take O(N^N) time to compute O(N!) results, i.e., still exponentially slower (as implied by Stirling’s approximation) than it needs to be.

Answered By: Karl Knechtel
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.