How to find all unique k-size combinations using at most one element from each of n lists

Question:

Given a list containing N sublists of varying lengths, find all unique combinations of size k, selecting at most one element from any sublist.

  • The order of the elements in the combination is not relevant: (a, b) = (b, a)

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output =
[
('B1', 'T1'),('B1', 'T2'),('B1', 'L1'),('B1', 'L2'),('B1', 'L3'),('B1', 'L4'),
('B2', 'T1'),('B2', 'T2'),('B2', 'L1'),('B2', 'L2'),('B2', 'L3'),('B2', 'L4'),
('B3', 'T1'),('B3', 'T2'),('B3', 'L1'),('B3', 'L2'),('B3', 'L3'),('B3', 'L4'),
('T1', 'L1'),('T1', 'L2'),('T1', 'L3'),('T1', 'L4'),
('T2', 'L1'),('T2', 'L2'),('T2', 'L3'),('T2', 'L4')
]
  • Extra points for a pythonic way of doing it
  • Speed/efficiency matters: the idea is to use it on a list of hundreds of sublists, each 5 to 50 items long

What I have been able to accomplish so far:
I used for and while loops to move pointers and build the answer (not really happy about it). However, I am having a hard time figuring out how to include a k parameter to set the size of the tuple combinations dynamically.

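def build_combinations(lst):
    # Builds pairs only (k is fixed at 2).
    result = []
    count_of_lst = len(lst)
    for i, sublist in enumerate(lst):
        if i == count_of_lst - 1:
            continue  # the last sublist has no later sublist to pair with
        for item in sublist:
            # Use a separate pointer so the outer index i is never overwritten;
            # resetting i to 0 paired items with their own sublist.
            n = i
            while n < count_of_lst - 1:
                j = 0
                while j <= len(lst[n+1]) - 1:
                    result.append((item, lst[n+1][j]))
                    j = j + 1
                n = n + 1
    return result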

I’ve seen many similar questions on Stack Overflow, but none of them addressed the parameters the way I am trying to (at most one item from each list, with the size of the combinations passed as a function parameter).

I tried using itertools combinations, product and permutations, and flipping them around, without success. Whenever I use itertools, I either have a hard time using only one item from each list, or I am unable to set the size of the tuple I need.

I tried NumPy, using arrays and a more math/matrix approach, but didn’t get far. There’s probably a way of solving this with NumPy, which is why I tagged numpy as well.

Asked By: eric_308


Answers:

You need to combine two itertools helpers: combinations to select each group of sample_k sublists (keeping them in their original order, so no group is repeated), then product to take one element from each sublist in the group:

from itertools import combinations, product

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output = [pair
                   for lists in combinations(sample_list, sample_k)
                   for pair in product(*lists)]
print(expected_output)

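Since the question asks for the tuple size to be a function parameter, the same idea can be wrapped in a small generator. This is only a sketch: the name one_per_list_combinations is made up for illustration and is not part of itertools.

```python
from itertools import combinations, product

def one_per_list_combinations(lists, k):
    """Yield every k-tuple whose elements come from k different sublists."""
    # combinations() picks k sublists and keeps their original order,
    # so (a, b) and (b, a) are never both produced.
    for chosen in combinations(lists, k):
        # product() then takes one element from each chosen sublist.
        yield from product(*chosen)

sample_list = [['B1', 'B2', 'B3'], ['T1', 'T2'], ['L1', 'L2', 'L3', 'L4']]
pairs = list(one_per_list_combinations(sample_list, 2))
triples = list(one_per_list_combinations(sample_list, 3))
print(len(pairs))    # 26
print(len(triples))  # 24
```

The tuples come out grouped by sublist pair (all B/T pairs, then B/L, then T/L), so the order differs from the expected_output in the question, but the set of tuples is the same. If the same value appears in more than one sublist, duplicates are possible and would need separate handling.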

If you want to get really fancy/clever/ugly, you can push all the work down to the C layer with:

from itertools import combinations, product, starmap, chain

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output = list(chain.from_iterable(starmap(product, combinations(sample_list, sample_k))))
print(expected_output)

That will almost certainly run somewhat faster for huge inputs (especially if you can loop over the results from chain.from_iterable directly rather than realizing them as a list), but it’s probably not worth the ugliness unless you’re really tight for cycles. I wouldn’t expect much more than a 10% speed-up, and you’d need to benchmark to be sure.
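For inputs the size the question describes, it may help to iterate lazily and to know the result count before materializing anything. The sketch below assumes that is acceptable for your use case; iter_combinations and count_combinations are hypothetical helper names, and the count uses a simple running tally over the sublist lengths rather than generating any tuples.

```python
from itertools import chain, combinations, islice, product, starmap

def iter_combinations(lists, k):
    # Lazy: no tuple is built until the caller iterates.
    return chain.from_iterable(starmap(product, combinations(lists, k)))

def count_combinations(lists, k):
    # counts[j] = number of j-tuples that can be built from the sublists seen so far.
    counts = [1] + [0] * k
    for sub in lists:
        # Go from high j to low j so each sublist contributes at most one element.
        for j in range(k, 0, -1):
            counts[j] += counts[j - 1] * len(sub)
    return counts[k]

sample_list = [['B1', 'B2', 'B3'], ['T1', 'T2'], ['L1', 'L2', 'L3', 'L4']]
total = count_combinations(sample_list, 2)
first_three = list(islice(iter_combinations(sample_list, 2), 3))
print(total)        # 26
print(first_three)  # [('B1', 'T1'), ('B1', 'T2'), ('B2', 'T1')]
```

With hundreds of sublists the count grows very quickly as k increases, so checking it first is a cheap way to decide whether building a full list is realistic at all.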

Answered By: ShadowRanger