What is the purpose of Python's itertools.repeat?

Question:

For every use I can think of for Python’s itertools.repeat() class, I can think of another equally (possibly more) acceptable solution to achieve the same effect. For example:

>>> [i for i in itertools.repeat('example', 5)]
['example', 'example', 'example', 'example', 'example']
>>> ['example'] * 5
['example', 'example', 'example', 'example', 'example']

>>> list(map(str.upper, itertools.repeat('example', 5)))
['EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE']
>>> ['example'.upper()] * 5
['EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE', 'EXAMPLE']

Is there any case in which itertools.repeat() would be the most appropriate solution? If so, under what circumstances?

Asked By: Tyler Crompton

Answers:

Your example of foo * 5 looks superficially similar to itertools.repeat(foo, 5), but it is actually quite different.

If you write foo * 100000, the interpreter must build the entire result, with the contents of foo repeated 100,000 times, before it can give you an answer. For large counts that is an expensive and memory-unfriendly operation.

But if you write itertools.repeat(foo, 100000), the interpreter can return an iterator that serves the same function, and doesn’t need to compute a result until you need it — say, by using it in a function that wants to know each result in the sequence.

That’s the major advantage of iterators: they can defer the computation of a part (or all) of a list until you really need the answer.
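As a rough illustration of that difference, the sketch below compares the size of a fully built list with the size of the equivalent repeat iterator. The variable names are only for illustration, and sys.getsizeof reports the container itself rather than the objects it refers to, so treat the numbers as indicative rather than exact.

```python
import sys
from itertools import repeat

foo = 'example'

# Eager: the whole 100,000-element list exists as soon as this line runs.
eager = [foo] * 100000

# Lazy: only a small iterator object exists; items are produced on demand.
lazy = repeat(foo, 100000)

print(sys.getsizeof(eager))  # grows with the number of elements
print(sys.getsizeof(lazy))   # stays small regardless of the count

# The iterator still yields all 100,000 items when consumed.
consumed = sum(1 for _ in lazy)
print(consumed)
```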

Answered By: John Feminella

It’s an iterator. Big clue here: it’s in the itertools module. From the documentation you linked to:

itertools.repeat(object[, times])
Make an iterator that returns object over and over again. Runs indefinitely unless the times argument is specified.

So you won’t ever have all that stuff in memory. An example where you want to use it might be

import itertools

n = 25
t = 0
for x in itertools.repeat(4):
    if t > n:
        print(t)
        break  # without this the loop would never end
    else:
        t += x

as this will allow you an arbitrary number of 4s, or whatever you might need an infinite list of.
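The same idea can be written by composing the infinite stream with another itertools function. This is only a sketch of one possible variation, using itertools.accumulate (Python 3) to produce the running totals; it is not part of the original answer.

```python
from itertools import accumulate, repeat

n = 25

# accumulate() lazily yields the running totals 4, 8, 12, ...
for t in accumulate(repeat(4)):
    if t > n:
        break  # stop at the first total that exceeds n

print(t)  # 28
```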

Answered By: machine yearning

The itertools.repeat function is lazy; it only uses the memory required for one item. On the other hand, the (a,) * n and [a] * n idioms build a full sequence of n references to the object in memory. For five items, the multiplication idiom is probably better, but you might notice a resource problem if you had to repeat something, say, a million times.

Still, it is hard to imagine many static uses for itertools.repeat. However, the fact that itertools.repeat is a function allows you to use it in many functional applications. For example, you might have some library function func which operates on an iterable of input. Sometimes, you might have pre-constructed lists of various items. Other times, you may just want to operate on a uniform list. If the list is big, itertools.repeat will save you memory.
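A minimal sketch of that situation follows. The function weighted_sum is hypothetical and stands in for any library function that accepts an iterable; the point is that the caller can pass either a pre-built list or a repeat object without the function knowing the difference.

```python
from itertools import repeat

def weighted_sum(values, weights):
    """Hypothetical library function: accepts any iterable of weights."""
    return sum(v * w for v, w in zip(values, weights))

values = [3, 5, 7]

# Sometimes the weights come from a pre-constructed list...
varied = weighted_sum(values, [0.5, 1.0, 2.0])

# ...and sometimes every item should get the same weight.
uniform = weighted_sum(values, repeat(2))

print(varied)   # 20.5
print(uniform)  # 30
```

Because zip stops at the shortest input, the infinite repeat(2) is safe here; a function that consumed its argument without such a bound would need repeat(2, len(values)) instead.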

Finally, repeat makes possible the so-called “iterator algebra” described in the itertools documentation. Even the itertools module itself uses the repeat function. For example, the following code is given as an equivalent implementation of itertools.izip_longest (named zip_longest in Python 3), even though the real code is probably written in C. Note the use of repeat seven lines from the bottom:

from itertools import chain, repeat

class ZipExhausted(Exception):
    pass

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = [len(args) - 1]
    def sentinel():
        if not counter[0]:
            raise ZipExhausted
        counter[0] -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass

Answered By: HardlyKnowEm

The primary purpose of itertools.repeat is to supply a stream of constant values to be used with map or zip:

>>> list(map(pow, range(10), repeat(2)))     # list of squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

The secondary purpose is that it gives a very fast way to loop a fixed number of times like this:

for _ in itertools.repeat(None, 10000):
    do_something()

This is faster than:

for i in range(10000):
    do_something()

The former wins because all it needs to do is update the reference count for the existing None object. The latter loses because the range() or xrange() needs to manufacture 10,000 distinct integer objects.
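To check the claim on your own interpreter, a sketch along these lines can be used. The helper names are made up for illustration, and the outcome depends on the Python version and machine; on recent CPython releases the gap may be small, so no expected timings are shown.

```python
import timeit
from itertools import repeat

def loop_with_repeat(n):
    """Loop n times over the same None object."""
    count = 0
    for _ in repeat(None, n):
        count += 1
    return count

def loop_with_range(n):
    """Loop n times over successive integers."""
    count = 0
    for _ in range(n):
        count += 1
    return count

n = 10000
t_repeat = timeit.timeit(lambda: loop_with_repeat(n), number=100)
t_range = timeit.timeit(lambda: loop_with_range(n), number=100)
print(f"repeat: {t_repeat:.4f}s  range: {t_range:.4f}s")
```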

Note, Guido himself uses that fast looping technique in the timeit() module. See the source at https://hg.python.org/cpython/file/2.7/Lib/timeit.py#l195 :

    if itertools:
        it = itertools.repeat(None, number)
    else:
        it = [None] * number
    gcold = gc.isenabled()
    gc.disable()
    try:
        timing = self.inner(it, self.timer)

Answered By: Raymond Hettinger

As mentioned before, it works well with zip. Another example:

from itertools import repeat

fruits = ['apples', 'oranges', 'bananas']

# Initialize inventory to zero for each fruit type.
inventory = dict(zip(fruits, repeat(0)))

Result:

{'apples': 0, 'oranges': 0, 'bananas': 0}

To do this without repeat, I’d have to involve len(fruits).
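For comparison, here is a sketch of the len-based alternative alluded to above, next to the repeat version. It also shows dict.fromkeys, a built-in that is not mentioned in the answer but covers this particular case; all three produce the same dictionary.

```python
from itertools import repeat

fruits = ['apples', 'oranges', 'bananas']

# The version from the answer: no need to know how many fruits there are.
with_repeat = dict(zip(fruits, repeat(0)))

# Without repeat, the list of zeros has to be sized explicitly.
with_len = dict(zip(fruits, [0] * len(fruits)))

# dict.fromkeys is another standard way to give every key the same value.
with_fromkeys = dict.fromkeys(fruits, 0)

print(with_repeat == with_len == with_fromkeys)  # True
```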

Answered By: Jonathon Reinhart

I usually use repeat in conjunction with chain and cycle. Here is an example:

from itertools import chain, repeat, cycle

fruits = ['apples', 'oranges', 'bananas', 'pineapples', 'grapes', 'berries']

inventory = list(zip(fruits, chain(repeat(10, 2), cycle(range(1, 3)))))

print(inventory)

This assigns the value 10 to the first two fruits, then cycles through the values 1 and 2 for the remaining fruits.
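To see the value stream on its own, the chained iterator can be sliced with itertools.islice. This is a small sketch for inspection only; the name first_six is just illustrative.

```python
from itertools import chain, cycle, islice, repeat

# Two 10s from repeat, followed by an endless 1, 2, 1, 2, ... from cycle.
values = chain(repeat(10, 2), cycle(range(1, 3)))

# The stream is infinite, so take only the first six items.
first_six = list(islice(values, 6))
print(first_six)  # [10, 10, 1, 2, 1, 2]
```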

Answered By: Stefan Gruenwald