How do I convert a tuple of tuples to a one-dimensional list using list comprehension?
Question:
I have a tuple of tuples – for example:
tupleOfTuples = ((1, 2), (3, 4), (5,))
I want to convert this into a flat, one-dimensional list of all the elements in order:
[1, 2, 3, 4, 5]
I’ve been trying to accomplish this with list comprehension. But I can’t seem to figure it out. I was able to accomplish it with a for-each loop:
myList = []
for tuple in tupleOfTuples:
myList = myList + list(tuple)
But I feel like there must be a way to do this with a list comprehension.
A simple [list(tuple) for tuple in tupleOfTuples]
just gives you a list of lists, instead of individual elements. I thought I could perhaps build on this by using the unpacking operator to then unpack the list, like so:
[*list(tuple) for tuple in tupleOfTuples]
or
[*(list(tuple)) for tuple in tupleOfTuples]
… but that didn’t work. Any ideas? Or should I just stick to the loop?
Answers:
it’s typically referred to as flattening a nested structure.
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> [element for tupl in tupleOfTuples for element in tupl]
[1, 2, 3, 4, 5]
Just to demonstrate efficiency:
>>> import timeit
>>> it = lambda: list(chain(*tupleOfTuples))
>>> timeit.timeit(it)
2.1475738355700913
>>> lc = lambda: [element for tupl in tupleOfTuples for element in tupl]
>>> timeit.timeit(lc)
1.5745135182887857
ETA: Please don’t use tuple
as a variable name, it shadows built-in.
I like using ‘reduce’ in this situation (this is what reduce made for!)
lot = ((1, 2), (3, 4), (5,))
print list(reduce(lambda t1, t2: t1 + t2, lot))
> [1,2,3,4,5]
You’re chaining the tuples together:
from itertools import chain
print list(chain(*listOfTuples))
Should be pretty readable if you’re familiar with itertools
, and without the explicit list
you even have your result in generator form.
Another solution using itertools.chain
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> from itertools import chain
>>> [x for x in chain.from_iterable(tupleOfTuples)]
[1, 2, 3, 4, 5]
Most of these answers will only work for a single level of flattening. For a more comprehensive solution, try this (from http://rightfootin.blogspot.com/2006/09/more-on-python-flatten.html):
def flatten(l, ltypes=(list, tuple)):
ltype = type(l)
l = list(l)
i = 0
while i < len(l):
while isinstance(l[i], ltypes):
if not l[i]:
l.pop(i)
i -= 1
break
else:
l[i:i + 1] = l[i]
i += 1
return ltype(l)
For multilevel, and readable code:
def flatten(bla):
output = []
for item in bla:
output += flatten(item) if hasattr (item, "__iter__") or hasattr (item, "__len__") else [item]
return output
I could not get this to fit in one line (and remain readable, even by far)
Just use sum
if you don’t have a lot of tuples.
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> sum(tupleOfTuples, ())
(1, 2, 3, 4, 5)
>>> list(sum(tupleOfTuples, ())) # if you really need a list
[1, 2, 3, 4, 5]
If you do have a lot of tuples, use list comprehension or chain.from_iterable
to prevent the quadratic behavior of sum
.
Micro-benchmarks:
-
Python 2.6
-
Long tuple of short tuples
$ python2.6 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]'
10000 loops, best of 3: 134 usec per loop
$ python2.6 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))'
1000 loops, best of 3: 1.1 msec per loop
$ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
10000 loops, best of 3: 60.1 usec per loop
$ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))'
10000 loops, best of 3: 64.8 usec per loop
-
Short tuple of long tuples
$ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]'
10000 loops, best of 3: 65.6 usec per loop
$ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))'
100000 loops, best of 3: 16.9 usec per loop
$ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
10000 loops, best of 3: 25.8 usec per loop
$ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))'
10000 loops, best of 3: 26.5 usec per loop
-
Python 3.1
-
Long tuple of short tuples
$ python3.1 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]'
10000 loops, best of 3: 121 usec per loop
$ python3.1 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))'
1000 loops, best of 3: 1.09 msec per loop
$ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
10000 loops, best of 3: 59.5 usec per loop
$ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))'
10000 loops, best of 3: 63.2 usec per loop
-
Short tuple of long tuples
$ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]'
10000 loops, best of 3: 66.1 usec per loop
$ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))'
100000 loops, best of 3: 16.3 usec per loop
$ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
10000 loops, best of 3: 25.4 usec per loop
$ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))'
10000 loops, best of 3: 25.6 usec per loop
Observation:
sum
is faster if the outer tuple is short.
list(chain.from_iterable(x))
is faster if the outer tuple is long.
I have a tuple of tuples – for example:
tupleOfTuples = ((1, 2), (3, 4), (5,))
I want to convert this into a flat, one-dimensional list of all the elements in order:
[1, 2, 3, 4, 5]
I’ve been trying to accomplish this with list comprehension. But I can’t seem to figure it out. I was able to accomplish it with a for-each loop:
myList = []
for tuple in tupleOfTuples:
myList = myList + list(tuple)
But I feel like there must be a way to do this with a list comprehension.
A simple [list(tuple) for tuple in tupleOfTuples]
just gives you a list of lists, instead of individual elements. I thought I could perhaps build on this by using the unpacking operator to then unpack the list, like so:
[*list(tuple) for tuple in tupleOfTuples]
or
[*(list(tuple)) for tuple in tupleOfTuples]
… but that didn’t work. Any ideas? Or should I just stick to the loop?
it’s typically referred to as flattening a nested structure.
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> [element for tupl in tupleOfTuples for element in tupl]
[1, 2, 3, 4, 5]
Just to demonstrate efficiency:
>>> import timeit
>>> it = lambda: list(chain(*tupleOfTuples))
>>> timeit.timeit(it)
2.1475738355700913
>>> lc = lambda: [element for tupl in tupleOfTuples for element in tupl]
>>> timeit.timeit(lc)
1.5745135182887857
ETA: Please don’t use tuple
as a variable name, it shadows built-in.
I like using ‘reduce’ in this situation (this is what reduce made for!)
lot = ((1, 2), (3, 4), (5,))
print list(reduce(lambda t1, t2: t1 + t2, lot))
> [1,2,3,4,5]
You’re chaining the tuples together:
from itertools import chain
print list(chain(*listOfTuples))
Should be pretty readable if you’re familiar with itertools
, and without the explicit list
you even have your result in generator form.
Another solution using itertools.chain
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> from itertools import chain
>>> [x for x in chain.from_iterable(tupleOfTuples)]
[1, 2, 3, 4, 5]
Most of these answers will only work for a single level of flattening. For a more comprehensive solution, try this (from http://rightfootin.blogspot.com/2006/09/more-on-python-flatten.html):
def flatten(l, ltypes=(list, tuple)):
ltype = type(l)
l = list(l)
i = 0
while i < len(l):
while isinstance(l[i], ltypes):
if not l[i]:
l.pop(i)
i -= 1
break
else:
l[i:i + 1] = l[i]
i += 1
return ltype(l)
For multilevel, and readable code:
def flatten(bla):
output = []
for item in bla:
output += flatten(item) if hasattr (item, "__iter__") or hasattr (item, "__len__") else [item]
return output
I could not get this to fit in one line (and remain readable, even by far)
Just use sum
if you don’t have a lot of tuples.
>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> sum(tupleOfTuples, ())
(1, 2, 3, 4, 5)
>>> list(sum(tupleOfTuples, ())) # if you really need a list
[1, 2, 3, 4, 5]
If you do have a lot of tuples, use list comprehension or chain.from_iterable
to prevent the quadratic behavior of sum
.
Micro-benchmarks:
-
Python 2.6
-
Long tuple of short tuples
$ python2.6 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]' 10000 loops, best of 3: 134 usec per loop $ python2.6 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))' 1000 loops, best of 3: 1.1 msec per loop $ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))' 10000 loops, best of 3: 60.1 usec per loop $ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))' 10000 loops, best of 3: 64.8 usec per loop
-
Short tuple of long tuples
$ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]' 10000 loops, best of 3: 65.6 usec per loop $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))' 100000 loops, best of 3: 16.9 usec per loop $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))' 10000 loops, best of 3: 25.8 usec per loop $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))' 10000 loops, best of 3: 26.5 usec per loop
-
-
Python 3.1
-
Long tuple of short tuples
$ python3.1 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]' 10000 loops, best of 3: 121 usec per loop $ python3.1 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))' 1000 loops, best of 3: 1.09 msec per loop $ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))' 10000 loops, best of 3: 59.5 usec per loop $ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))' 10000 loops, best of 3: 63.2 usec per loop
-
Short tuple of long tuples
$ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]' 10000 loops, best of 3: 66.1 usec per loop $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))' 100000 loops, best of 3: 16.3 usec per loop $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))' 10000 loops, best of 3: 25.4 usec per loop $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))' 10000 loops, best of 3: 25.6 usec per loop
-
Observation:
sum
is faster if the outer tuple is short.list(chain.from_iterable(x))
is faster if the outer tuple is long.