Python iterator and zip

Question:

With x = [1,2,3,4], I can get an iterator from i = iter(x).

With this iterator, I can use the zip function to create tuples of two items each.

>>> i = iter(x)
>>> zip(i,i)
[(1, 2), (3, 4)]

I can even use this syntax to get the same result.

>>> zip(*[i] * 2)
[(1, 2), (3, 4)]

How does this work? How do zip(i,i) and zip(*[i] * 2) work with an iterator?

Asked By: prosseek


Answers:

Every time you get an item from an iterator, it stays at that spot rather than “rewinding.” So zip(i, i) gets the first item from i, then the second item from i, and returns that as a tuple. It continues to do this for each available pair, until the iterator is exhausted.
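To see the "no rewinding" behaviour without zip in the picture, here is a minimal sketch that pulls two items by hand with next. The variable names first, second, pair and rest are only for illustration.

```python
x = [1, 2, 3, 4]
i = iter(x)

first = next(i)   # takes 1 from the iterator
second = next(i)  # takes 2; the iterator did not go back to the start
pair = (first, second)

# Whatever is left is all that a later "take two" could still see.
rest = list(i)

print(pair)  # (1, 2)
print(rest)  # [3, 4]
```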

zip(*[i]*2) creates the list [i, i] by multiplying the one-element list [i] by 2, then unpacks it with the * at the far left, which, in effect, passes the two arguments i and i to zip, producing the same result as the first snippet.
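A short sketch of why the list multiplication matters: [i] * n repeats a reference to the same iterator and does not copy it, so the same trick groups items n at a time. This assumes Python 3, where zip is lazy and has to be wrapped in list() to get the output shown in the question. The names refs, same_object and triples are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]
i = iter(x)

# Three references to one and the same iterator object.
refs = [i] * 3
same_object = all(r is i for r in refs)

# Each output tuple is built by calling next() on that one iterator three times.
triples = list(zip(*refs))

print(same_object)  # True
print(triples)      # [(1, 2, 3), (4, 5, 6)]
```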

Answered By: TigerhawkT3

An iterator is like a stream of items. You can only look at the items in the stream one at a time and you only ever have access to the first element. To look at something in the stream, you need to remove it from the stream and once you take something from the top of the stream, it’s gone from the stream for good.

When you call zip(i, i), zip first looks at the first stream and takes an item out. Then it looks at the second stream (which happens to be the same stream as the first one) and takes an item out. Then it makes a tuple out of those two items and repeats this over and over until there is nothing left in the stream.

Maybe it’s easier to see if I were to write the zip function in pure Python (with only 2 arguments for simplicity). It would look something like this [1]:

def zip(a, b):
    out = []
    try:
        while True:
            item1 = next(a)
            item2 = next(b)
            out.append((item1, item2))
    except StopIteration:
        return out

Now imagine the case that you are talking about where a and b are the same object. In that case, we just call next twice on the iterator (i in your example case) which will just take the first two items from i in sequence and pack them into a tuple.
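To make the contrast concrete, here is the same simplified function called once with a shared iterator and once with two independent ones. It is renamed zip_two here (a hypothetical name) only so the sketch does not shadow the built-in zip.

```python
def zip_two(a, b):
    """Simplified two-argument zip; expects iterators, not arbitrary iterables."""
    out = []
    try:
        while True:
            item1 = next(a)
            item2 = next(b)
            out.append((item1, item2))
    except StopIteration:
        return out


x = [1, 2, 3, 4]

# a and b are the same object, so each loop pass advances it twice.
shared = iter(x)
same = zip_two(shared, shared)

# Two separate iterators over the same list advance independently.
independent = zip_two(iter(x), iter(x))

print(same)         # [(1, 2), (3, 4)]
print(independent)  # [(1, 1), (2, 2), (3, 3), (4, 4)]
```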

Once we’ve understood why zip(i, i) behaves the way it does, zip(*([i] * 2)) isn’t too hard. Let’s read the expression from the inside out…

[i] * 2

That just creates a new list (of length 2) where both of the elements are references to the iterator i. So it’s the same thing as zip(*[i, i]) (it’s just more convenient to write when you want to repeat something more than 2 times). * unpacking is a common idiom in Python and you can find more information in the Python tutorial. The gist of it is that Python takes the iterable and “unpacks” it as if each item of the iterable was a separate positional argument to the function. So:

zip(*[i, i])

does the same thing as:

zip(i, i)

And now Bob’s our uncle. We’ve just come full-circle since zip(i, i) is where this discussion started.
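One detail worth checking: the question writes zip(*[i] * 2) without the inner parentheses. Inside a call, the leading * applies to the whole expression [i] * 2, so both spellings pass the same two arguments. A small sketch, where describe is a hypothetical helper that just reports what it received:

```python
def describe(*args):
    """Hypothetical helper: return the argument count and whether all are one object."""
    return len(args), all(a is args[0] for a in args)


i = iter([1, 2, 3, 4])

without_parens = describe(*[i] * 2)
with_parens = describe(*([i] * 2))

print(without_parens)  # (2, True)
print(with_parens)     # (2, True)
```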

[1] This example code is simplified in more ways than only accepting 2 arguments, as mentioned above. For example, the built-in zip calls iter on its input arguments so that it works for any iterable (not just iterators), but this should be enough to get the point across…
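The iter call mentioned in the footnote is also why the pairing trick needs an iterator and not the list itself. A rough illustration, assuming Python 3 (hence the list() wrappers); the variable names are illustrative:

```python
x = [1, 2, 3, 4]

# Calling iter() on a list gives a fresh iterator each time,
# so zip walks two independent iterators here.
from_list = list(zip(x, x))

# Calling iter() on an iterator returns the iterator itself,
# so zip ends up sharing one stream between both arguments.
i = iter(x)
returns_itself = iter(i) is i
from_iterator = list(zip(i, i))

print(from_list)       # [(1, 1), (2, 2), (3, 3), (4, 4)]
print(returns_itself)  # True
print(from_iterator)   # [(1, 2), (3, 4)]
```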

Answered By: mgilson