Efficiency of using a Python list as a queue

Question:

A coworker recently wrote a program in which he used a Python list as a queue. In other words, he used .append(x) when needing to insert items and .pop(0) when needing to remove items.

I know that Python has collections.deque and I’m trying to figure out whether to spend my (limited) time to rewrite this code to use it. Assuming that we perform millions of appends and pops but never have more than a few thousand entries, will his list usage be a problem?

In particular, will the underlying array used by the Python list implementation continue to grow indefinitely and end up with millions of slots even though the list only holds about a thousand items, or will Python eventually do a realloc and free up some of that memory?

Asked By: Eli Courtwright


Answers:

It sounds like a bit of empirical testing might be the best thing to do here: second-order effects might make one approach better in practice, even if it is not better in theory.

Answered By: Peter

You won’t run out of memory using the list implementation, but performance will be poor. From the docs:

Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

So using a deque will be much faster.
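
For anyone weighing the rewrite, here is a minimal sketch of what the switch looks like. The function name drain_in_order is made up for illustration; the only real API used is collections.deque with its append and popleft methods. The enqueue side is unchanged from the list version, and only .pop(0) becomes .popleft().

```python
from collections import deque


def drain_in_order(items):
    """Push every item into a FIFO queue, then pop them all back out.

    Hypothetical helper, only here to show the two calls that matter.
    """
    queue = deque()
    for item in items:
        queue.append(item)  # enqueue on the right, same call as with a list
    drained = []
    while queue:
        # dequeue from the left; this is what replaces list.pop(0)
        drained.append(queue.popleft())
    return drained


result = drain_in_order(["a", "b", "c"])
print(result)
```

Items come out in the order they went in, so code that only appends and pops from the front should behave the same after the change.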

Answered By: John Millikin

Every .pop(0) takes O(N) steps for a list of N items, since all the remaining elements have to be shifted down by one position. The required memory will not grow endlessly; it stays roughly as big as needed for the items currently held.

I’d recommend using deque to get O(1) append and pop from front.
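
The memory claim can be checked empirically. The sketch below is an assumption-laden experiment, not a statement about how every Python implementation behaves: it assumes CPython, where sys.getsizeof reports the size of the list object including its pointer array (but not the elements themselves). The helper name peak_list_bytes is made up for illustration.

```python
import sys


def peak_list_bytes(initial_size, cycles):
    """Use a list as a FIFO for `cycles` append/pop(0) pairs.

    Returns (starting size in bytes, largest size in bytes observed).
    Sizes come from sys.getsizeof, so this assumes CPython.
    """
    q = list(range(initial_size))
    start = sys.getsizeof(q)
    peak = start
    for i in range(cycles):
        q.append(i)   # enqueue at the end
        q.pop(0)      # dequeue from the front
        peak = max(peak, sys.getsizeof(q))
    return start, peak


start_bytes, peak_bytes = peak_list_bytes(1000, 20000)
print(start_bytes, peak_bytes)
```

If the underlying array kept growing with every append, the peak would end up many times the starting size. On CPython it should stay close to the starting size, because the list length never exceeds the initial size plus one.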

Answered By: bayer

Some answers claimed a “10x” speed advantage for deque vs list-used-as-FIFO when both have 1000 entries, but that’s a bit of an overbid:

$ python -mtimeit -s'q=list(range(1000))' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 1.24 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(1000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.573 usec per loop

python -mtimeit is your friend: a really useful and simple micro-benchmarking approach! With it you can of course also trivially explore performance in much-smaller cases:

$ python -mtimeit -s'q=list(range(100))' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 0.972 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(100))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.576 usec per loop

(not very different for 12 instead of 100 items btw), and in much-larger ones:

$ python -mtimeit -s'q=list(range(10000))' 'q.append(23); q.pop(0)'
100000 loops, best of 3: 5.81 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(10000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.574 usec per loop

You can see that the claim of O(1) performance for deque is well founded: its time stays near 0.57 usec at every size, while the list is over twice as slow at around 1,000 items and an order of magnitude slower at around 10,000. You can also see that even in such cases you’re only wasting 5 microseconds or so per append/pop pair, and you can decide for yourself how significant that wastage is (though if that’s all you’re doing with that container, deque has no downside, so you might as well switch even if 5 usec more or less won’t make an important difference).
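
The same measurements can be scripted with the timeit module instead of the command line, which is handy if you want to sweep several queue sizes in one go. This is only a sketch: the helper name time_fifo_pair and the default of 20000 repetitions are choices made for illustration, and the numbers will differ by machine and Python version.

```python
import timeit


def time_fifo_pair(size, number=20000):
    """Return (list_seconds, deque_seconds) per append/pop pair
    for a queue held at a steady length of `size`."""
    list_total = timeit.timeit(
        "q.append(23); q.pop(0)",
        setup="q = list(range(%d))" % size,
        number=number,
    )
    deque_total = timeit.timeit(
        "q.append(23); q.popleft()",
        setup="import collections; q = collections.deque(range(%d))" % size,
        number=number,
    )
    # timeit returns the total for `number` runs, so divide to get per-pair cost
    return list_total / number, deque_total / number


for size in (100, 1000, 10000):
    list_s, deque_s = time_fifo_pair(size)
    print("%6d items: list %.3f usec, deque %.3f usec"
          % (size, list_s * 1e6, deque_s * 1e6))
```

Plugging in the queue length your program actually reaches gives a per-operation cost you can multiply by the number of operations to see whether the rewrite is worth the time.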

Answered By: Alex Martelli

From Beazley’s Python Essential Reference, Fourth Edition, p. 194:

Some library modules provide new types that outperform the built-ins at certain tasks. For instance, collections.deque type provides similar functionality to a list but has been highly optimized for the insertion of items at both ends. A list, in contrast, is only efficient when appending items at the end. If you insert items at the front, all of the other elements need to be shifted in order to make room. The time required to do this grows as the list gets larger and larger. Just to give you an idea of the difference, here is a timing measurement of inserting one million items at the front of a list and a deque:

And there follows this code sample:

>>> from timeit import timeit
>>> timeit('s.appendleft(37)', 'import collections; s = collections.deque()', number=1000000)
0.13162776274638258
>>> timeit('s.insert(0,37)', 's = []', number=1000000)
932.07849908298408

Timings are from my machine.


2012-07-01 Update

>>> from timeit import timeit
>>> n = 1024 * 1024
>>> while n > 1:
...     print('%s %d' % ('-' * 30, n))
...     timeit('s.appendleft(37)', 'import collections; s = collections.deque()', number=n)
...     timeit('s.insert(0,37)', 's = []', number=n)
...     n >>= 1
... 
------------------------------ 1048576
0.1239769458770752
799.2552740573883
------------------------------ 524288
0.06924104690551758
148.9747350215912
------------------------------ 262144
0.029170989990234375
35.077512979507446
------------------------------ 131072
0.013737916946411133
9.134140014648438
------------------------------ 65536
0.006711006164550781
1.8818109035491943
------------------------------ 32768
0.00327301025390625
0.48307204246520996
------------------------------ 16384
0.0016388893127441406
0.11021995544433594
------------------------------ 8192
0.0008249282836914062
0.028419017791748047
------------------------------ 4096
0.00044918060302734375
0.00740504264831543
------------------------------ 2048
0.00021195411682128906
0.0021741390228271484
------------------------------ 1024
0.00011205673217773438
0.0006101131439208984
------------------------------ 512
6.198883056640625e-05
0.00021386146545410156
------------------------------ 256
2.9087066650390625e-05
8.797645568847656e-05
------------------------------ 128
1.5974044799804688e-05
3.600120544433594e-05
------------------------------ 64
8.821487426757812e-06
1.9073486328125e-05
------------------------------ 32
5.0067901611328125e-06
1.0013580322265625e-05
------------------------------ 16
3.0994415283203125e-06
5.9604644775390625e-06
------------------------------ 8
3.0994415283203125e-06
5.0067901611328125e-06
------------------------------ 4
3.0994415283203125e-06
4.0531158447265625e-06
------------------------------ 2
2.1457672119140625e-06
2.86102294921875e-06

Answered By: hughdbrown