How come list element lookup is O(1) in Python?

Question:

Today in class, we learned that retrieving an element from a list is O(1) in Python. Why is this the case? Suppose I have a list of four items, for example:

li = ["perry", 1, 23.5, "s"]

These items have different sizes in memory. And so it is not possible to take the memory location of li[0] and add three times the size of each element to get the memory location of li[3]. So how does the interpreter know where li[3] is without having to traverse the list in order to retrieve the element?

Asked By: Teererai Marange


Answers:

A list in Python is implemented as an array of pointers [1]. So, what’s really happening when you create the list:

["perry", 1, 23.5, "s"]

is that you are actually creating an array of pointers like so:

[0xa3d25342, 0x635423fa, 0xff243546, 0x2545fade]

Each pointer “points” to the respective objects in memory, so that the string "perry" will be stored at address 0xa3d25342 and the number 1 will be stored at 0x635423fa, etc.

Since all pointers are the same size, the interpreter can in fact add 3 times the size of a pointer to the address of the slot for li[0] to get to the pointer stored at li[3]. The sizes of the objects being pointed to never enter the calculation.
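A rough way to check this from Python itself, assuming CPython (sys.getsizeof is an implementation detail and may behave differently elsewhere): two lists of the same length report the same size no matter how large their elements are, because each list holds only one pointer per element. The names below are illustrative.

```python
import sys

# Two sequences of equal length; the last item of `mixed` is deliberately huge.
small = (1, 2, 3, 4)
mixed = ("perry", 1, 23.5, "s" * 10_000)

# Build both lists the same way so their allocations are comparable.
small_list = list(small)
mixed_list = list(mixed)

# The list's own footprint depends only on how many pointers it holds.
same_footprint = sys.getsizeof(small_list) == sys.getsizeof(mixed_list)

# The elements themselves differ a lot in size, but they live elsewhere.
element_sizes = [sys.getsizeof(x) for x in mixed_list]

print(same_footprint)
print(element_sizes)
```

The exact byte counts vary by platform and Python version, so only the comparison is meaningful here.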


[1] For more details, go straight to the horse’s mouth: the CPython source code on GitHub.

Answered By: DJG

When you say a = [...], a is effectively a pointer to a PyObject containing an array of pointers to PyObjects.

When you ask for a[2], the interpreter first follows the pointer to the list’s PyObject, then adds 2 times the pointer size to the address of the array inside it, then returns the pointer stored at that slot. The same happens if you ask for a[0] or a[9999], so the cost does not depend on the index.
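The address arithmetic can be sketched with ctypes, assuming CPython, where id() returns an object's memory address. This does not inspect the list's real internal buffer; it builds a separate stand-in array of object pointers and indexes it by hand. The helper slot_address is a hypothetical name used only for illustration.

```python
import ctypes

POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)

def slot_address(base, index):
    """Address of the index-th pointer slot: one multiply and one add."""
    return base + index * POINTER_SIZE

li = ["perry", 1, 23.5, "s"]

# A C-style array of object pointers, filled with the same objects as li.
slots = (ctypes.py_object * len(li))(*li)
base = ctypes.addressof(slots)

# Read the raw pointer stored in slot 3 without touching slots 0..2.
stored_pointer = ctypes.c_void_p.from_address(slot_address(base, 3)).value

# In CPython the stored pointer is the address of the object itself.
print(stored_pointer == id(li[3]))
```

The work done by slot_address is the same for index 0 and index 9999, which is the whole reason the lookup is O(1).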

Basically, all Python objects are accessed by reference instead of by value, even integer literals like 2. There are just some tricks in the pointer system to keep this all efficient. And pointers have a known size, so they can be stored conveniently in C-style arrays.
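A small illustration of access by reference: the same object can sit in two lists at once, because each list stores only a pointer to it. The variable names are made up for the example.

```python
shared = {"name": "perry"}

# Both lists hold a reference to the same dict, not a copy of it.
a = [shared, 2]
b = [shared]

# Mutating the object through one name is visible through every list.
shared["name"] = "changed"

print(a[0] is b[0])
print(b[0]["name"])
```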

Answered By: Draconis

Short answer: Python lists are arrays.

Long answer: The computer science term list usually means either a singly-linked list (as used in functional programming) or a doubly-linked list (as used in procedural programming). These data structures support O(1) insertion at either the head of the list (functionally) or at any position that does not need to be searched for (procedurally). A Python “list” has none of these characteristics. Instead it supports (amortized) O(1) appending at the end of the list (like a C++ std::vector or Java ArrayList). Python lists are really resizable arrays in CS terms.
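The amortized O(1) append can be observed indirectly, assuming CPython: the list over-allocates when it grows, so most appends reuse spare capacity and only a few trigger a reallocation. The sketch below counts how often the reported size changes. count_resizes is a hypothetical helper, and the exact count depends on the interpreter version.

```python
import sys

def count_resizes(n):
    """Append n items one at a time and count how often the allocation changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying pointer array was reallocated to a larger one.
            resizes += 1
            last_size = size
    return resizes

resizes = count_resizes(10_000)
print(f"10000 appends triggered {resizes} reallocations")
```

Because reallocations become rarer as the list grows, the total copying work stays proportional to the number of appends.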

The following comment from the Python documentation explains some of the performance characteristics of Python “lists”:

It is also possible to use a list as a queue, where the first element added is the first element retrieved (“first-in, first-out”); however, lists are not efficient for this purpose. While appends and pops from the end of list are fast, doing inserts or pops from the beginning of a list is slow (because all of the other elements have to be shifted by one).
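For the queue case, the same part of the Python documentation recommends collections.deque, which supports fast appends and pops at both ends. A minimal sketch, reusing the values from the question:

```python
from collections import deque

# A deque avoids shifting every element when removing from the front.
queue = deque(["perry", 1, 23.5])
queue.append("s")        # enqueue at the right end
first = queue.popleft()  # dequeue from the left end

print(first)
print(list(queue))
```

The trade-off is that indexing into the middle of a deque is not O(1), so it does not replace a list for random access.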

Answered By: hkBst