# Transpose/Unzip Function (inverse of zip)?

## Question:

I have a list of 2-item tuples and I’d like to convert them to 2 lists where the first contains the first item in each tuple and the second list holds the second item.

**For example:**

```
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# and I want to become...
result = (['a', 'b', 'c', 'd'], [1, 2, 3, 4])
```

Is there a builtin function that does that?

## Answers:

In 2.x, `zip` is its own inverse, provided you use the special `*` operator.

```
>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
```

This is equivalent to calling `zip` with each element of the list as a separate argument:

```
zip(('a', 1), ('b', 2), ('c', 3), ('d', 4))
```

except that the arguments are passed to `zip` directly (after being converted to a tuple), so there’s no need to worry about the number of arguments getting too big.

In 3.x, `zip` returns a lazy iterator, but this is trivially converted:

```
>>> list(zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)]))
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
```
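As a small sanity check of the "own inverse" claim in Python 3, the sketch below unzips and re-zips the question's data. The variable names are illustrative only, and the empty-input case is included because unpacking into two names fails when there is nothing to unzip.

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# Unzip: each tuple becomes one positional argument to zip.
letters, numbers = zip(*original)

# Zip again to recover the original pairs.
round_trip = list(zip(letters, numbers))

# Edge case: with no input, zip() yields nothing, so two-name unpacking fails.
try:
    empty_letters, empty_numbers = zip(*[])
except ValueError:
    empty_unpack_failed = True
else:
    empty_unpack_failed = False
```

If the input may be empty, it is probably safer to keep the result as a single sequence (`list(zip(*original))`) and check its length before unpacking.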

You could also do

```
result = ([ a for a,b in original ], [ b for a,b in original ])
```

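It *should* scale better for long inputs, since it avoids unpacking every tuple into a separate argument to `zip`. Note, however, that list comprehensions are always evaluated eagerly, and this version iterates over `original` twice.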

(Incidentally, it makes a 2-tuple (pair) of lists, rather than a list of tuples, like `zip` does.)

If generators instead of actual lists are ok, this would do that:

```
result = (( a for a,b in original ), ( b for a,b in original ))
```

The generators don’t munch through the list until you ask for each element, but on the other hand, they do keep references to the original list.
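To illustrate the point about laziness and the retained reference, here is a minimal sketch (the names `firsts` and `seconds` are made up for this example). Because nothing is read until the generators are consumed, a later change to the list shows up in the results.

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# Nothing is iterated yet; both generators hold a reference to `original`.
firsts = (a for a, b in original)
seconds = (b for a, b in original)

# Mutating the list before consuming the generators is visible to them.
original.append(('e', 5))

first_items = list(firsts)    # ['a', 'b', 'c', 'd', 'e']
second_items = list(seconds)  # [1, 2, 3, 4, 5]
```

If a snapshot is needed, the list-comprehension version above is the safer choice.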

If you have lists that are not the same length, you may not want to use `zip` as per Patrick’s answer. This works:

```
>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
```

But with different length lists, zip truncates each item to the length of the shortest list:

```
>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e')]
```

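In Python 2, you can use `map` with `None` as the function to fill empty results with `None` (this no longer works in Python 3, where `itertools.zip_longest` does the same job):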

```
>>> map(None, *[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]
```
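For Python 3, a rough equivalent of the `map(None, ...)` trick can be sketched with `itertools.zip_longest`, which pads the shorter tuples with a fill value (`None` by default). The name `ragged` is just for illustration.

```python
from itertools import zip_longest

ragged = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e',)]

# Pads missing positions with None instead of truncating.
padded = list(zip_longest(*ragged))
# [('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]

# A different placeholder can be supplied via fillvalue.
padded_zero = list(zip_longest(*ragged, fillvalue=0))
# [('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, 0)]
```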

zip() is marginally faster though.

I like to use `zip(*iterable)` (which is the piece of code you’re looking for) in my programs like so:

```
def unzip(iterable):
    return zip(*iterable)
```

I find `unzip` more readable.

This is just another way to do it, but it helped me a lot, so I’m writing it here:

Having this data structure:

```
X=[1,2,3,4]
Y=['a','b','c','d']
XY=zip(X,Y)
```

Resulting in:

```
In: XY
Out: [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
```

The more pythonic way to unzip it and go back to the original is this one in my opinion:

```
x,y=zip(*XY)
```

But this returns tuples, so if you need lists you can use:

```
x,y=(list(x),list(y))
```

To get a tuple of lists, as in the question:

```
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple([list(tup) for tup in zip(*original)])
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])
```

To unpack the two lists into separate variables:

```
list1, list2 = [list(tup) for tup in zip(*original)]
```

Since it returns tuples (and can use tons of memory), the `zip(*zipped)` trick seems more clever than useful, to me.

Here’s a function that will actually give you the inverse of zip.

```
def unzip(zipped):
    """Inverse of built-in zip function.
    Args:
        zipped: a list of tuples
    Returns:
        a tuple of lists
    Example:
        a = [1, 2, 3]
        b = [4, 5, 6]
        zipped = list(zip(a, b))
        assert zipped == [(1, 4), (2, 5), (3, 6)]
        unzipped = unzip(zipped)
        assert unzipped == ([1, 2, 3], [4, 5, 6])
    """
    unzipped = ()
    if len(zipped) == 0:
        return unzipped
    dim = len(zipped[0])
    for i in range(dim):
        unzipped = unzipped + ([tup[i] for tup in zipped], )
    return unzipped
```

None of the previous answers *efficiently* provide the required output, which is a **tuple of lists**, rather than a *list of tuples*. For the former, you can use `tuple` with `map`. Here’s the difference:

```
res1 = list(zip(*original)) # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
res2 = tuple(map(list, zip(*original))) # (['a', 'b', 'c', 'd'], [1, 2, 3, 4])
```

In addition, most of the previous solutions assume Python 2.7, where `zip` returns a list rather than an iterator.

For Python 3.x, you will need to pass the result to a function such as `list` or `tuple` to exhaust the iterator. For memory-efficient iterators, you can omit the outer `list` and `tuple` calls for the respective solutions.

While `zip(*seq)` is very useful, it may be unsuitable for very long sequences, as it will create a tuple of values to be passed in. For example, I’ve been working with a coordinate system with over a million entries and find it significantly faster to create the sequences directly.

A generic approach would be something like this:

```
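from collections import deque

# seq is a sequence of equal-length tuples: ((a1, b1, …), (a2, b2, …), …)
width = len(seq[0])
# One independent deque per column, each bounded to len(seq) items.
# (`[deque()] * width` would repeat a single shared deque.)
output = [deque(maxlen=len(seq)) for _ in range(width)]
for element in seq:
    for s, item in zip(output, element):
        s.append(item)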
```

But, depending on what you want to do with the result, the choice of collection can make a big difference. In my actual use case, using sets and no internal loop is noticeably faster than all other approaches.
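The answer does not show its set-based code, so the following is only a guess at what "sets and no internal loop" might look like for a fixed width of two. The sample data and the names `xs` and `ys` are assumptions made for illustration.

```python
# Hypothetical sample data: a sequence of (x, y) coordinate pairs.
seq = [(0, 10), (1, 11), (0, 12), (2, 10)]

xs, ys = set(), set()
for x, y in seq:  # tuple unpacking replaces the inner zip loop
    xs.add(x)
    ys.add(y)
```

Note that sets discard duplicates and ordering, so this only fits cases where the distinct values per column are what matters.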

And, as others have noted, if you are doing this with datasets, it might make sense to use Numpy or Pandas collections instead.

# Naive approach

```
def transpose_finite_iterable(iterable):
    return zip(*iterable)  # `itertools.izip` for Python 2 users
```

works fine for a finite iterable (e.g. sequences like `list`/`tuple`/`str`) of (potentially infinite) iterables, which can be illustrated like

```
| |a_00| |a_10| ... |a_n0| |
| |a_01| |a_11| ... |a_n1| |
| |... | |... | ... |... | |
| |a_0i| |a_1i| ... |a_ni| |
| |... | |... | ... |... | |
```

where `n in ℕ`, `a_ij` corresponds to the `j`-th element of the `i`-th iterable, and after applying `transpose_finite_iterable` we get

```
| |a_00| |a_01| ... |a_0i| ... |
| |a_10| |a_11| ... |a_1i| ... |
| |... | |... | ... |... | ... |
| |a_n0| |a_n1| ... |a_ni| ... |
```

Python example of such a case, where `a_ij == j`, `n == 2`:

```
>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterable(iterable)
>>> next(result)
(0, 0)
>>> next(result)
(1, 1)
```

But we can’t use `transpose_finite_iterable` again to return to the structure of the original `iterable`, because `result` is an infinite iterable of finite iterables (`tuple`s in our case):

```
>>> transpose_finite_iterable(result)
... hangs ...
Traceback (most recent call last):
File "...", line 1, in ...
File "...", line 2, in transpose_finite_iterable
MemoryError
```

So how can we deal with this case?

# … and here comes the `deque`

After we take a look at the docs of the `itertools.tee` function, there is a Python recipe that, with some modification, can help in our case:

```
from collections import deque


def transpose_finite_iterables(iterable):
    iterator = iter(iterable)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    # one queue per coordinate, seeded with the first row
    queues = [deque([element])
              for element in first_elements]

    def coordinate(queue):
        while True:
            if not queue:
                # this queue is empty: pull the next row and feed every queue
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()

    return tuple(map(coordinate, queues))
```

let’s check

```
>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterables(transpose_finite_iterable(iterable))
>>> result
(<generator object transpose_finite_iterables.<locals>.coordinate at ...>, <generator object transpose_finite_iterables.<locals>.coordinate at ...>)
>>> next(result[0])
0
>>> next(result[0])
1
```

# Synthesis

Now we can define a general function for working with iterables of iterables, some of which are finite and others potentially infinite, using the `functools.singledispatch` decorator:

```
from collections import (abc,
                         deque)
from functools import singledispatch


@singledispatch
def transpose(object_):
    """
    Transposes given object.
    """
    raise TypeError('Unsupported object type: {type}.'
                    .format(type=type(object_)))


@transpose.register(abc.Iterable)
def transpose_finite_iterables(object_):
    """
    Transposes given iterable of finite iterables.
    """
    iterator = iter(object_)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    queues = [deque([element])
              for element in first_elements]

    def coordinate(queue):
        while True:
            if not queue:
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()

    return tuple(map(coordinate, queues))


def transpose_finite_iterable(object_):
    """
    Transposes given finite iterable of iterables.
    """
    yield from zip(*object_)


try:
    transpose.register(abc.Collection, transpose_finite_iterable)
except AttributeError:
    # Python3.5-
    transpose.register(abc.Mapping, transpose_finite_iterable)
    transpose.register(abc.Sequence, transpose_finite_iterable)
    transpose.register(abc.Set, transpose_finite_iterable)
```

which can be considered its own inverse (mathematicians call this kind of function an “involution”) in the class of binary operators over finite non-empty iterables.

As a bonus of `singledispatch`ing, we can handle `numpy` arrays like

```
import numpy as np
...
transpose.register(np.ndarray, np.transpose)
```

and then use it like

```
>>> array = np.arange(4).reshape((2,2))
>>> array
array([[0, 1],
       [2, 3]])
>>> transpose(array)
array([[0, 2],
       [1, 3]])
```

# Note

Since `transpose` returns iterators, if someone wants a `tuple` of `list`s like in the OP, this can be done additionally with the `map` built-in function:

```
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple(map(list, transpose(original)))
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])
```

# Advertisement

I’ve added a generalized solution to the `lz` package, from version `0.5.0`, which can be used like

```
>>> from lz.transposition import transpose
>>> list(map(tuple, transpose(zip(range(10), range(10, 20)))))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)]
```

# P.S.

There is no solution (at least no obvious one) for handling a potentially infinite iterable of potentially infinite iterables, but this case is less common.

Consider using `more_itertools.unzip`:

```
>>> from more_itertools import unzip
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> [list(x) for x in unzip(original)]
[['a', 'b', 'c', 'd'], [1, 2, 3, 4]]
```

While NumPy arrays and pandas may be preferable, this function imitates the behavior of `zip(*args)` when called as `unzip(args)`.

It allows generators, like the result from `zip` in Python 3, to be passed as `args`, as it iterates through the values.

```
def unzip(items, cls=list, ocls=tuple):
    """Zip function in reverse.

    :param items: Zipped-like iterable.
    :type items: iterable
    :param cls: Container factory. Callable that returns iterable containers,
        with a callable append attribute, to store the unzipped items. Defaults
        to ``list``.
    :type cls: callable, optional
    :param ocls: Outer container factory. Callable that builds an iterable
        container from an iterable, to store the inner containers
        (see ``cls``). Defaults to ``tuple``.
    :type ocls: callable, optional
    :returns: Unzipped items in instances returned from ``cls``, in an instance
        returned from ``ocls``.
    """
    # iter() will return the same iterator passed to it whenever possible.
    items = iter(items)
    try:
        i = next(items)
    except StopIteration:
        return ocls()
    # One inner container per column, seeded with the first item's values.
    unzipped = ocls(cls([v]) for v in i)
    for i in items:
        for c, v in zip(unzipped, i):
            c.append(v)
    return unzipped
```

To use list containers, simply run `unzip(zipped)`, as

```
unzip(zip(["a","b","c"],[1,2,3])) == (["a","b","c"],[1,2,3])
```

To use deques, or any other container sporting `append`, pass a factory function.

```
from collections import deque
unzip([("a",1),("b",2)], deque, list) == [deque(["a","b"]),deque([1,2])]
```

(Decorate `cls` and/or `ocls` to micromanage container initialization, as briefly shown in the final example above.)

Here’s a simple one-line answer that produces the desired output:

```
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
list(zip(*original))
# [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
```

Just to summarize:

```
# data
a = ('a', 'b', 'c', 'd')
b = (1, 2, 3, 4)
# forward
zipped = zip(a, b) # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# reverse
a_, b_ = zip(*zipped)
# verify
assert a == a_
assert b == b_
```
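One caveat worth sketching for Python 3: there `zip(a, b)` returns a one-shot iterator rather than the list shown in the comment, so the reverse step works only once per `zip` object. The names `first_pass` and `second_pass` are illustrative.

```python
a = ('a', 'b', 'c', 'd')
b = (1, 2, 3, 4)

zipped = zip(a, b)  # a lazy iterator in Python 3

# The first reversal consumes the iterator completely.
first_pass = list(zip(*zipped))   # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

# A second reversal sees an exhausted iterator and yields nothing.
second_pass = list(zip(*zipped))  # []
```

Wrapping the forward step in `list(...)` avoids the surprise if the zipped pairs need to be reused.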