How to get the Cartesian product of multiple lists
Question:
How can I get the Cartesian product (every possible combination of values) from a group of lists?
For example, given
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
How do I get this?
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5), ...]
One common application for this technique is to avoid deeply nested loops. See Avoiding nested for loops for a more specific duplicate. Similarly, this technique might be used to "explode" a dictionary with list values; see Combine Python Dictionary Permutations into List of Dictionaries .
If you want a Cartesian product of the same list with itself multiple times, itertools.product
can handle that elegantly. See Operation on every pair of element in a list or How can I get "permutations with repetitions" from a list (Cartesian product of a list with itself)?.
Many people who already know about itertools.product
struggle with the fact that it expects separate arguments for each input sequence, rather than e.g. a list of lists. The accepted answer shows how to handle this with *
. However, the use of *
here to unpack arguments is fundamentally not different from any other time it’s used in a function call. Please see Expanding tuples into arguments for this topic (and use that instead to close duplicate questions, as appropriate).
Answers:
Use itertools.product
, which has been available since Python 2.6.
import itertools
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
for element in itertools.product(*somelists):
print(element)
This is the same as:
for element in itertools.product([1, 2, 3], ['a', 'b'], [4, 5]):
print(element)
import itertools
>>> for i in itertools.product([1,2,3],['a','b'],[4,5]):
... print i
...
(1, 'a', 4)
(1, 'a', 5)
(1, 'b', 4)
(1, 'b', 5)
(2, 'a', 4)
(2, 'a', 5)
(2, 'b', 4)
(2, 'b', 5)
(3, 'a', 4)
(3, 'a', 5)
(3, 'b', 4)
(3, 'b', 5)
>>>
With itertools.product:
import itertools
result = list(itertools.product(*somelists))
In Python 2.6 and above, you can use ‘itertools.product`. In older versions of Python you can use the following (almost — see documentation) equivalent code from the documentation, at least as a starting point:
def product(*args, **kwds):
# product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
# product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
pools = map(tuple, args) * kwds.get('repeat', 1)
result = [[]]
for pool in pools:
result = [x+[y] for x in result for y in pool]
for prod in result:
yield tuple(prod)
The result of both is an iterator, so if you really need a list for further processing, use list(result)
.
For Python 2.5 and older:
>>> [(a, b, c) for a in [1,2,3] for b in ['a','b'] for c in [4,5]]
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4),
(2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5),
(3, 'b', 4), (3, 'b', 5)]
Here’s a recursive version of product()
(just an illustration):
def product(*args):
if not args:
return iter(((),)) # yield tuple()
return (items + (item,)
for items in product(*args[:-1]) for item in args[-1])
Example:
>>> list(product([1,2,3], ['a','b'], [4,5]))
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4),
(2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5),
(3, 'b', 4), (3, 'b', 5)]
>>> list(product([1,2,3]))
[(1,), (2,), (3,)]
>>> list(product([]))
[]
>>> list(product())
[()]
Here is a recursive generator, which doesn’t store any temporary lists
def product(ar_list):
if not ar_list:
yield ()
else:
for a in ar_list[0]:
for prod in product(ar_list[1:]):
yield (a,)+prod
print list(product([[1,2],[3,4],[5,6]]))
Output:
[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]
Just to add a bit to what has already been said: if you use SymPy, you can use symbols rather than strings which makes them mathematically useful.
import itertools
import sympy
x, y = sympy.symbols('x y')
somelist = [[x,y], [1,2,3], [4,5]]
somelist2 = [[1,2], [1,2,3], [4,5]]
for element in itertools.product(*somelist):
print element
I would use list comprehension:
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
cart_prod = [(a,b,c) for a in somelists[0] for b in somelists[1] for c in somelists[2]]
A minor modification to the above recursive generator solution in variadic flavor:
def product_args(*args):
if args:
for a in args[0]:
for prod in product_args(*args[1:]) if args[1:] else ((),):
yield (a,) + prod
And of course a wrapper which makes it work exactly the same as that solution:
def product2(ar_list):
"""
>>> list(product(()))
[()]
>>> list(product2(()))
[]
"""
return product_args(*ar_list)
with one trade-off: it checks if recursion should break upon each outer loop, and one gain: no yield upon empty call, e.g.product(())
, which I suppose would be semantically more correct (see the doctest).
Regarding list comprehension: the mathematical definition applies to an arbitrary number of arguments, while list comprehension could only deal with a known number of them.
Although there are many answers already, I would like to share some of my thoughts:
Iterative approach
def cartesian_iterative(pools):
result = [[]]
for pool in pools:
result = [x+[y] for x in result for y in pool]
return result
Recursive Approach
def cartesian_recursive(pools):
if len(pools) > 2:
pools[0] = product(pools[0], pools[1])
del pools[1]
return cartesian_recursive(pools)
else:
pools[0] = product(pools[0], pools[1])
del pools[1]
return pools
def product(x, y):
return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]
Lambda Approach
def cartesian_reduct(pools):
return reduce(lambda x,y: product(x,y) , pools)
Recursive Approach:
def rec_cart(start, array, partial, results):
if len(partial) == len(array):
results.append(partial)
return
for element in array[start]:
rec_cart(start+1, array, partial+[element], results)
rec_res = []
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]
rec_cart(0, some_lists, [], rec_res)
print(rec_res)
Iterative Approach:
def itr_cart(array):
results = [[]]
for i in range(len(array)):
temp = []
for res in results:
for element in array[i]:
temp.append(res+[element])
results = temp
return results
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]
itr_res = itr_cart(some_lists)
print(itr_res)
I believe this works:
def cartesian_product(L):
if L:
return {(a,) + b for a in L[0]
for b in cartesian_product(L[1:])}
else:
return {()}
You can use itertools.product
in the standard library to get the Cartesian product. Other cool, related utilities in itertools
include permutations
, combinations
, and combinations_with_replacement
. Here is a link to a Python CodePen for the snippet below:
from itertools import product
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
result = list(product(*somelists))
print(result)
The following code is a 95% copy from Using NumPy to build an array of all combinations of two arrays; all credits go there! This is said to be much faster since it is only in NumPy.
import numpy as np
def cartesian(arrays, dtype=None, out=None):
arrays = [np.asarray(x) for x in arrays]
if dtype is None:
dtype = arrays[0].dtype
n = np.prod([x.size for x in arrays])
if out is None:
out = np.zeros([n, len(arrays)], dtype=dtype)
m = int(n / arrays[0].size)
out[:,0] = np.repeat(arrays[0], m)
if arrays[1:]:
cartesian(arrays[1:], out=out[0:m, 1:])
for j in range(1, arrays[0].size):
out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
return out
You need to define the dtype as a parameter if you do not want to take the dtype from the first entry for all entries. Take dtype = ‘object’ if you have letters and numbers as items. Test:
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
[tuple(x) for x in cartesian(somelists, 'object')]
Out:
[(1, 'a', 4),
(1, 'a', 5),
(1, 'b', 4),
(1, 'b', 5),
(2, 'a', 4),
(2, 'a', 5),
(2, 'b', 4),
(2, 'b', 5),
(3, 'a', 4),
(3, 'a', 5),
(3, 'b', 4),
(3, 'b', 5)]
This can be done as
[(x, y) for x in range(10) for y in range(10)]
another variable? No problem:
[(x, y, z) for x in range(10) for y in range(10) for z in range(10)]
List comprehension is simple and clean:
import itertools
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
lst = [i for i in itertools.product(*somelists)]
In 99% of cases you should use itertools.product. It is written in efficient C code, so it is probably going to be better than any custom implementation.
In the 1% of cases that you need a Python-only algorithm (for example, if you need to modify it somehow), you can use the code below.
def product(*args, repeat=1):
"""Find the Cartesian product of the arguments.
The interface is identical to itertools.product.
"""
# Initialize data structures and handle bad input
if len(args) == 0:
yield () # Match behavior of itertools.product
return
gears = [tuple(arg) for arg in args] * repeat
for gear in gears:
if len(gear) == 0:
return
tooth_numbers = [0] * len(gears)
result = [gear[0] for gear in gears]
# Rotate through all gears
last_gear_number = len(gears) - 1
finished = False
while not finished:
yield tuple(result)
# Get next result
gear_number = last_gear_number
while gear_number >= 0:
gear = gears[gear_number]
tooth_number = tooth_numbers[gear_number] + 1
if tooth_number < len(gear):
# No gear change is necessary, so exit the loop
result[gear_number] = gear[tooth_number]
tooth_numbers[gear_number] = tooth_number
break
result[gear_number] = gear[0]
tooth_numbers[gear_number] = 0
gear_number -= 1
else:
# We changed all the gears, so we are back at the beginning
finished = True
The interface is the same as for itertools.product. For example:
>>> list(product((1, 2), "ab"))
[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
This algorithm has the following advantages over other Python-only solutions on this page:
- It does not build up intermediate results in memory, keeping the memory footprint small.
- It uses iteration instead of recursion, meaning you will not get "maximum recursion depth exceeded" errors.
- It can accept any number of input iterables, making it more flexible than using nested for loops.
This code is based on the itertools.product algorithm from PyPy, which is released under the MIT licence.
If you want to reimplement it yourself, you can try with recursion. Something as simple as:
def product(cats, prefix = ()):
if not cats:
yield prefix
else:
head, *tail = cats
for cat in head:
yield from product(tail, prefix + (cat,))
is a working start.
The recursion depth is how many lists of categories you have.
How can I get the Cartesian product (every possible combination of values) from a group of lists?
For example, given
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
How do I get this?
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5), ...]
One common application for this technique is to avoid deeply nested loops. See Avoiding nested for loops for a more specific duplicate. Similarly, this technique might be used to "explode" a dictionary with list values; see Combine Python Dictionary Permutations into List of Dictionaries .
If you want a Cartesian product of the same list with itself multiple times, itertools.product
can handle that elegantly. See Operation on every pair of element in a list or How can I get "permutations with repetitions" from a list (Cartesian product of a list with itself)?.
Many people who already know about itertools.product
struggle with the fact that it expects separate arguments for each input sequence, rather than e.g. a list of lists. The accepted answer shows how to handle this with *
. However, the use of *
here to unpack arguments is fundamentally not different from any other time it’s used in a function call. Please see Expanding tuples into arguments for this topic (and use that instead to close duplicate questions, as appropriate).
Use itertools.product
, which has been available since Python 2.6.
import itertools
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
for element in itertools.product(*somelists):
print(element)
This is the same as:
for element in itertools.product([1, 2, 3], ['a', 'b'], [4, 5]):
print(element)
import itertools
>>> for i in itertools.product([1,2,3],['a','b'],[4,5]):
... print i
...
(1, 'a', 4)
(1, 'a', 5)
(1, 'b', 4)
(1, 'b', 5)
(2, 'a', 4)
(2, 'a', 5)
(2, 'b', 4)
(2, 'b', 5)
(3, 'a', 4)
(3, 'a', 5)
(3, 'b', 4)
(3, 'b', 5)
>>>
With itertools.product:
import itertools
result = list(itertools.product(*somelists))
In Python 2.6 and above, you can use ‘itertools.product`. In older versions of Python you can use the following (almost — see documentation) equivalent code from the documentation, at least as a starting point:
def product(*args, **kwds):
# product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
# product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
pools = map(tuple, args) * kwds.get('repeat', 1)
result = [[]]
for pool in pools:
result = [x+[y] for x in result for y in pool]
for prod in result:
yield tuple(prod)
The result of both is an iterator, so if you really need a list for further processing, use list(result)
.
For Python 2.5 and older:
>>> [(a, b, c) for a in [1,2,3] for b in ['a','b'] for c in [4,5]]
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4),
(2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5),
(3, 'b', 4), (3, 'b', 5)]
Here’s a recursive version of product()
(just an illustration):
def product(*args):
if not args:
return iter(((),)) # yield tuple()
return (items + (item,)
for items in product(*args[:-1]) for item in args[-1])
Example:
>>> list(product([1,2,3], ['a','b'], [4,5]))
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4),
(2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5),
(3, 'b', 4), (3, 'b', 5)]
>>> list(product([1,2,3]))
[(1,), (2,), (3,)]
>>> list(product([]))
[]
>>> list(product())
[()]
Here is a recursive generator, which doesn’t store any temporary lists
def product(ar_list):
if not ar_list:
yield ()
else:
for a in ar_list[0]:
for prod in product(ar_list[1:]):
yield (a,)+prod
print list(product([[1,2],[3,4],[5,6]]))
Output:
[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]
Just to add a bit to what has already been said: if you use SymPy, you can use symbols rather than strings which makes them mathematically useful.
import itertools
import sympy
x, y = sympy.symbols('x y')
somelist = [[x,y], [1,2,3], [4,5]]
somelist2 = [[1,2], [1,2,3], [4,5]]
for element in itertools.product(*somelist):
print element
I would use list comprehension:
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
cart_prod = [(a,b,c) for a in somelists[0] for b in somelists[1] for c in somelists[2]]
A minor modification to the above recursive generator solution in variadic flavor:
def product_args(*args):
if args:
for a in args[0]:
for prod in product_args(*args[1:]) if args[1:] else ((),):
yield (a,) + prod
And of course a wrapper which makes it work exactly the same as that solution:
def product2(ar_list):
"""
>>> list(product(()))
[()]
>>> list(product2(()))
[]
"""
return product_args(*ar_list)
with one trade-off: it checks if recursion should break upon each outer loop, and one gain: no yield upon empty call, e.g.product(())
, which I suppose would be semantically more correct (see the doctest).
Regarding list comprehension: the mathematical definition applies to an arbitrary number of arguments, while list comprehension could only deal with a known number of them.
Although there are many answers already, I would like to share some of my thoughts:
Iterative approach
def cartesian_iterative(pools):
result = [[]]
for pool in pools:
result = [x+[y] for x in result for y in pool]
return result
Recursive Approach
def cartesian_recursive(pools):
if len(pools) > 2:
pools[0] = product(pools[0], pools[1])
del pools[1]
return cartesian_recursive(pools)
else:
pools[0] = product(pools[0], pools[1])
del pools[1]
return pools
def product(x, y):
return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]
Lambda Approach
def cartesian_reduct(pools):
return reduce(lambda x,y: product(x,y) , pools)
Recursive Approach:
def rec_cart(start, array, partial, results):
if len(partial) == len(array):
results.append(partial)
return
for element in array[start]:
rec_cart(start+1, array, partial+[element], results)
rec_res = []
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]
rec_cart(0, some_lists, [], rec_res)
print(rec_res)
Iterative Approach:
def itr_cart(array):
results = [[]]
for i in range(len(array)):
temp = []
for res in results:
for element in array[i]:
temp.append(res+[element])
results = temp
return results
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]
itr_res = itr_cart(some_lists)
print(itr_res)
I believe this works:
def cartesian_product(L):
if L:
return {(a,) + b for a in L[0]
for b in cartesian_product(L[1:])}
else:
return {()}
You can use itertools.product
in the standard library to get the Cartesian product. Other cool, related utilities in itertools
include permutations
, combinations
, and combinations_with_replacement
. Here is a link to a Python CodePen for the snippet below:
from itertools import product
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
result = list(product(*somelists))
print(result)
The following code is a 95% copy from Using NumPy to build an array of all combinations of two arrays; all credits go there! This is said to be much faster since it is only in NumPy.
import numpy as np
def cartesian(arrays, dtype=None, out=None):
arrays = [np.asarray(x) for x in arrays]
if dtype is None:
dtype = arrays[0].dtype
n = np.prod([x.size for x in arrays])
if out is None:
out = np.zeros([n, len(arrays)], dtype=dtype)
m = int(n / arrays[0].size)
out[:,0] = np.repeat(arrays[0], m)
if arrays[1:]:
cartesian(arrays[1:], out=out[0:m, 1:])
for j in range(1, arrays[0].size):
out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
return out
You need to define the dtype as a parameter if you do not want to take the dtype from the first entry for all entries. Take dtype = ‘object’ if you have letters and numbers as items. Test:
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
[tuple(x) for x in cartesian(somelists, 'object')]
Out:
[(1, 'a', 4),
(1, 'a', 5),
(1, 'b', 4),
(1, 'b', 5),
(2, 'a', 4),
(2, 'a', 5),
(2, 'b', 4),
(2, 'b', 5),
(3, 'a', 4),
(3, 'a', 5),
(3, 'b', 4),
(3, 'b', 5)]
This can be done as
[(x, y) for x in range(10) for y in range(10)]
another variable? No problem:
[(x, y, z) for x in range(10) for y in range(10) for z in range(10)]
List comprehension is simple and clean:
import itertools
somelists = [
[1, 2, 3],
['a', 'b'],
[4, 5]
]
lst = [i for i in itertools.product(*somelists)]
In 99% of cases you should use itertools.product. It is written in efficient C code, so it is probably going to be better than any custom implementation.
In the 1% of cases that you need a Python-only algorithm (for example, if you need to modify it somehow), you can use the code below.
def product(*args, repeat=1):
"""Find the Cartesian product of the arguments.
The interface is identical to itertools.product.
"""
# Initialize data structures and handle bad input
if len(args) == 0:
yield () # Match behavior of itertools.product
return
gears = [tuple(arg) for arg in args] * repeat
for gear in gears:
if len(gear) == 0:
return
tooth_numbers = [0] * len(gears)
result = [gear[0] for gear in gears]
# Rotate through all gears
last_gear_number = len(gears) - 1
finished = False
while not finished:
yield tuple(result)
# Get next result
gear_number = last_gear_number
while gear_number >= 0:
gear = gears[gear_number]
tooth_number = tooth_numbers[gear_number] + 1
if tooth_number < len(gear):
# No gear change is necessary, so exit the loop
result[gear_number] = gear[tooth_number]
tooth_numbers[gear_number] = tooth_number
break
result[gear_number] = gear[0]
tooth_numbers[gear_number] = 0
gear_number -= 1
else:
# We changed all the gears, so we are back at the beginning
finished = True
The interface is the same as for itertools.product. For example:
>>> list(product((1, 2), "ab"))
[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
This algorithm has the following advantages over other Python-only solutions on this page:
- It does not build up intermediate results in memory, keeping the memory footprint small.
- It uses iteration instead of recursion, meaning you will not get "maximum recursion depth exceeded" errors.
- It can accept any number of input iterables, making it more flexible than using nested for loops.
This code is based on the itertools.product algorithm from PyPy, which is released under the MIT licence.
If you want to reimplement it yourself, you can try with recursion. Something as simple as:
def product(cats, prefix = ()):
if not cats:
yield prefix
else:
head, *tail = cats
for cat in head:
yield from product(tail, prefix + (cat,))
is a working start.
The recursion depth is how many lists of categories you have.