What is the underlying data structure for Python lists?

Question:

What is the typical underlying data structure used to implement Python’s built-in list data type?

Asked By: Nixuz

Answers:

CPython:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;

As the following line shows, the list's elements are stored in an array of pointers to PyObjects.

PyObject **ob_item;
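
What an array of pointers means in practice can be sketched from the Python side. This is only an illustration and assumes CPython, since sys.getsizeof reports implementation-specific sizes; the helper one_slot is a made-up name for this example. The list's own size depends on how many slots it has, not on how big the referenced objects are, and two slots can refer to the same object.

```python
import sys

def one_slot(obj):
    """Hypothetical helper: build a one-element list the same way each time."""
    return [obj]

small = one_slot(0)
large = one_slot("x" * 10_000)

# Each list holds a single pointer, so the list objects themselves are the
# same size even though the objects they point to differ greatly in size.
same_list_size = sys.getsizeof(small) == sys.getsizeof(large)

# The list stores references, not copies: both slots point at one dict.
item = {"key": "value"}
pair = [item, item]
shares_object = pair[0] is pair[1]

print(same_list_size, shares_object)
```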
Answered By: Georg Schölly

In the Jython implementation, it’s an ArrayList<PyObject>.

Answered By: nategood

List objects are implemented as arrays. They are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

See also:
http://docs.python.org/library/collections.html#collections.deque

Btw, I find it interesting that the Python tutorial on data structures recommends using pop(0) to simulate a queue but does not mention O(n) or the deque option.

http://docs.python.org/tutorial/datastructures.html#using-lists-as-queues
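
The difference between pop(0) on a list and popleft() on a deque can be sketched as follows. The function names drain_list and drain_deque are made up for this example; both return the same items in the same order, and only the per-call cost noted in the comments differs.

```python
from collections import deque

def drain_list(items):
    """Consume a list from the front with pop(0)."""
    queue = list(items)
    out = []
    while queue:
        # pop(0) shifts every remaining pointer one slot left: O(n) per call.
        out.append(queue.pop(0))
    return out

def drain_deque(items):
    """Consume a deque from the front with popleft()."""
    queue = deque(items)
    out = []
    while queue:
        # popleft() does not move the remaining elements: O(1) per call.
        out.append(queue.popleft())
    return out

data = range(5)
print(drain_list(data), drain_deque(data))
```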

Answered By: e1i45

Although it may be obvious, it is worth stating that Python lists are dynamic arrays (as opposed to static arrays). This is an important distinction that comes up in job interview questions and in academia.

Because the array is dynamic, Python reserves more memory than the list currently needs whenever the list has to grow. (In CPython, an empty list such as the one below starts with no element storage at all; the spare capacity appears on the first append.) E.g.:

somelist = []

Because extra memory is already set aside, somelist.append() usually just writes to the next reserved slot and is therefore O(1) most of the time (amortized O(1) overall). A static array, by contrast, is typically exactly full (i.e. an array holding 4 elements has room for exactly 4), so every append would be O(n): it requires allocating an entirely new block of memory (room for 5 elements now) and copying the contents over.
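
Why appends stay cheap on average can be illustrated with a toy dynamic array. This is a minimal sketch, not CPython's implementation: the class DynamicArray, its copies counter, and the doubling policy are all assumptions made for the example (CPython over-allocates by a smaller proportion). It counts how many element copies the resizes cause.

```python
class DynamicArray:
    """Toy dynamic array that doubles its capacity when full (hypothetical)."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity
        self.copies = 0  # total element copies caused by resizing

    def append(self, value):
        # Only a full array pays the O(n) cost of moving to a bigger block.
        if self._size == self._capacity:
            self._grow(2 * self._capacity)
        self._slots[self._size] = value
        self._size += 1

    def _grow(self, new_capacity):
        new_slots = [None] * new_capacity
        for i in range(self._size):
            new_slots[i] = self._slots[i]
            self.copies += 1
        self._slots = new_slots
        self._capacity = new_capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._slots[index]


arr = DynamicArray()
for n in range(1000):
    arr.append(n)

print(len(arr), arr.copies)
```

With doubling, the total number of copies stays below twice the number of appends, which is what amortized O(1) means here. A static array that grows by exactly one slot per append would instead copy every existing element each time.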

Answered By: Adam Hughes

The list is a built-in data structure in Python, but it can also be used to build user-defined data structures. The two main user-defined data structures built from lists are stacks and queues.
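
A minimal sketch of both uses with a plain list; the variable names are chosen only for illustration.

```python
# Stack: push and pop at the end of the list (last in, first out).
stack = []
for item in ("a", "b", "c"):
    stack.append(item)
last_in = stack.pop()

# Queue: add at the end, remove from the front (first in, first out).
queue = ["a", "b", "c"]
first_in = queue.pop(0)

print(last_in, first_in)
```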

Answered By: Riya Parashar