deque.popleft() and list.pop(0). Is there performance difference?

Question:

deque.popleft() and list.pop(0) seem to return the same result. Is there any performance difference between them and why?

Asked By: Bin

||

Answers:

Yes, and it’s considerable if you have a long list or deque. All elements in a list are placed contiguously in memory, so if you remove any element, all subsequent elements must be shifted one position to the left – therefore, the time required to remove or insert an element at the start of a list is proportional to the length of the list. A deque, on the other hand, is specifically constructed to allow fast insertions or removal at either end (typically by allowing “empty” memory locations at the beginning of the deque, or to wrap around so that the end of the memory segment occupied by the deque can contain elements that are actually considered to be at the beginning of the deque).

Compare the performance of these two snippets:

d = deque([0] * 1000000)
while d:
    d.popleft()
    if len(d) % 100 == 0:
        print(len(d))

lst = [0] * 1000000
while lst:
    lst.pop(0)
    if len(lst) % 100 == 0:
        print(len(lst))
Answered By: Aasmund Eldhuset

Is there performance difference?

Yes. deque.popleft() is O(1) — a constant time operation. While list.pop(0) is O(n) — linear time operation: the larger the list the longer it takes.

Why?

CPython list implementation is array-based. pop(0) removes the first item from the list and it requires to shift left len(lst) - 1 items to fill the gap.

deque() implementation uses a doubly linked list. No matter how large the deque, deque.popleft() requires a constant (limited above) number of operations.

Answered By: jfs

deque.popleft() is faster than list.pop(0), because the deque has been optimized to do popleft() approximately in O(1), while list.pop(0) takes O(n) (see deque objects).

Comments and code in _collectionsmodule.c for deque and listobject.c for list provide implementation insights to explain the performance differences. Namely that a deque object “is composed of a doubly-linked list”, which effectively optimizes appends and pops at both ends, while list objects are not even singly-linked lists but C arrays (of pointers to elements (see Python 2.7 listobject.h#l22 and Python 3.5 listobject.h#l23), which makes them good for fast random access of elements but requires O(n) time to reposition all elements after removal of the first.

For Python 2.7 and 3.5, the URLs of these source code files are:

  1. https://hg.python.org/cpython/file/2.7/Modules/_collectionsmodule.c

  2. https://hg.python.org/cpython/file/2.7/Objects/listobject.c

  3. https://hg.python.org/cpython/file/3.5/Modules/_collectionsmodule.c

  4. https://hg.python.org/cpython/file/3.5/Objects/listobject.c

Using %timeit, the performance difference between deque.popleft() and list.pop(0) is about a factor of 4 when both the deque and the list have the same 52 elements and grows to over a factor of 1000 when their lengths are 10**8. Test results are given below.

import string
from collections import deque

%timeit d = deque(string.letters); d.popleft()
1000000 loops, best of 3: 1.46 µs per loop

%timeit d = deque(string.letters)
1000000 loops, best of 3: 1.4 µs per loop

%timeit l = list(string.letters); l.pop(0)
1000000 loops, best of 3: 1.47 µs per loop

%timeit l = list(string.letters);
1000000 loops, best of 3: 1.22 µs per loop

d = deque(range(100000000))

%timeit d.popleft()
10000000 loops, best of 3: 90.5 ns per loop

l = range(100000000)

%timeit l.pop(0)
10 loops, best of 3: 93.4 ms per loop
Answered By: user4322779