Why is b.pop(0) over 200 times slower than del b[0] for bytearray?

Question:

Letting them compete three times (a million pops/dels each time):

from timeit import timeit

for _ in range(3):
    t1 = timeit('b.pop(0)', 'b = bytearray(1000000)')
    t2 = timeit('del b[0]', 'b = bytearray(1000000)')
    print(t1 / t2)

Time ratios:

274.6037053753368
219.38099365582403
252.08691226683823

Why is pop that much slower at doing the same thing?

Asked By: Kelly Bundy


Answers:

When you run b.pop(0), Python shifts every remaining element one position toward the front, as you might expect. This takes O(n) time.

When you run del b[0], Python simply advances the object's start pointer by 1, which takes O(1) time.

In both cases, PyByteArray_Resize is called to adjust the size. When the new size is smaller than half the allocated size, the allocated memory will be shrunk. In the del b[0] case, this is the only point where the data will be copied. As a result, this case will take O(1) amortized time.
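
To make the amortized argument concrete, here is a toy pure-Python model of a buffer with a movable logical start. It is only a sketch: the class OffsetBuffer, its attributes, and the exact compaction threshold are made up for illustration and do not mirror CPython's actual data layout. It counts how many elements get copied while every element is deleted from the front.

```python
class OffsetBuffer:
    """Toy model of a byte buffer with a movable logical start.

    Hypothetical illustration only: the names are invented and this is
    not how CPython lays out a bytearray internally.
    """

    def __init__(self, data):
        self._buf = list(data)   # backing storage
        self._start = 0          # index of the first live element
        self.copied = 0          # elements moved by compaction so far

    def __len__(self):
        return len(self._buf) - self._start

    def delete_first(self):
        """Drop the first element by advancing the start offset."""
        if len(self) == 0:
            raise IndexError("delete from empty buffer")
        self._start += 1
        # Compact only once more than half of the storage is dead space.
        if self._start > len(self._buf) // 2:
            self.copied += len(self)
            self._buf = self._buf[self._start:]
            self._start = 0

    def to_list(self):
        return self._buf[self._start:]


buf = OffsetBuffer(range(1000))
for _ in range(1000):
    buf.delete_first()
# Total copying stays below the initial size, so each delete is O(1) amortized.
print(len(buf), buf.copied)
```

Each compaction copies fewer than half of the elements that the previous storage held, so the total work over n deletions is bounded by n. A model of pop(0) that shifts the whole tail on every call would instead copy on the order of n^2 elements.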

Relevant code:

bytearray_pop_impl function: Always calls

memmove(buf + index, buf + index + 1, n - index);

The bytearray_setslice_linear function is called for del b[0] with lo == 0, hi == 1, bytes_len == 0. It reaches this code (with growth == -1):

if (lo == 0) {
    /* Shrink the buffer by advancing its logical start */
    self->ob_start -= growth;
    /*
      0   lo               hi             old_size
      |   |<----avail----->|<-----tail------>|
      |      |<-bytes_len->|<-----tail------>|
      0    new_lo         new_hi          new_size
    */
}
else {
    /*
      0   lo               hi               old_size
      |   |<----avail----->|<-----tomove------>|
      |   |<-bytes_len->|<-----tomove------>|
      0   lo         new_hi              new_size
    */
    memmove(buf + lo + bytes_len, buf + hi,
            Py_SIZE(self) - hi);
}
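
One way to check the O(n) versus O(1) amortized claim empirically is to time a single operation at several bytearray sizes. The helper per_op_seconds below is a hypothetical name introduced for this sketch, and the actual numbers will depend on the machine and the CPython version, so no expected output is shown.

```python
from timeit import timeit


def per_op_seconds(stmt, size, number=1000):
    """Average time of one `stmt` on a fresh bytearray of `size` bytes.

    `size` should be larger than `number` so the bytearray never runs empty.
    """
    setup = f"b = bytearray({size})"
    return timeit(stmt, setup, number=number) / number


if __name__ == "__main__":
    for size in (10_000, 100_000, 1_000_000):
        t_pop = per_op_seconds("b.pop(0)", size)
        t_del = per_op_seconds("del b[0]", size)
        print(f"{size:>9}  pop(0): {t_pop:.2e} s   del b[0]: {t_del:.2e} s")
```

If the explanation above holds, the pop(0) column should grow roughly in proportion to the size while the del b[0] column should stay roughly flat.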

Answered By: interjay

I have to admit, I was very surprised by the timings myself. After convincing myself that they were in fact correct, I took a dive into the CPython source code, and I think I found the answer: CPython optimizes del bytearr[0:x] by just incrementing the pointer to the start of the array:

    if (growth < 0) {
        if (!_canresize(self))
            return -1;

        if (lo == 0) {
            /* Shrink the buffer by advancing its logical start */
            self->ob_start -= growth;

You can find the del bytearray[...] logic here (implemented via bytearray_setslice, with values being NULL), which in turn calls bytearray_setslice_linear, which contains the above optimization.

For comparison, bytearray.pop does NOT implement this optimization; see here in the source code.
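
Since the optimization applies to any slice deletion that starts at index 0, it also covers the common pattern of consuming a prefix from a receive buffer. The following is a minimal sketch under an assumed wire format (one length byte followed by that many payload bytes); the function take_frames and the format itself are hypothetical and only serve to show del buf[:n] in use.

```python
def take_frames(buf):
    """Remove and return every complete frame at the front of `buf`.

    Assumed (hypothetical) wire format: one length byte, then that many
    payload bytes. Incomplete trailing data is left in `buf`.
    """
    frames = []
    while buf and len(buf) >= 1 + buf[0]:
        n = buf[0]
        frames.append(bytes(buf[1:1 + n]))
        del buf[:1 + n]   # deletion starting at index 0 (the lo == 0 case)
    return frames


buf = bytearray(b"\x02hi\x03abc\x05xy")
print(take_frames(buf))  # the two complete frames
print(buf)               # the incomplete third frame stays buffered
```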

Answered By: Dillon Davis

b.pop(0) is slower than del b[0] for a bytearray because of how each operation removes the first byte. A bytearray stores its elements in a contiguous, resizable block of memory, and both operations modify that block in place; neither creates a new bytearray object.

b.pop(0) shifts all remaining elements one position to the left with memmove, which costs O(n) per call.

del b[0], on the other hand, only advances the bytearray's internal start pointer past the first element. The data is copied only when the buffer is occasionally reallocated to a smaller size, so deletion from the front is O(1) amortized. With a million-byte bytearray, that difference is why b.pop(0) is over 200 times slower than del b[0].

It is worth noting that b.pop(0) can be useful in situations where you need to get the value of the first element while also removing it from the bytearray. If you only need to remove the first element, del b[0] is a more efficient choice.
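
If you need the value as well, one possible workaround is to read b[0] first and then delete it, which keeps the pop semantics while going through the del path. The helper name pop_front is hypothetical, not part of the bytearray API.

```python
def pop_front(b):
    """Return b[0] and remove it, using `del` instead of `pop(0)`."""
    value = b[0]   # raises IndexError on an empty bytearray
    del b[0]
    return value


b = bytearray(b"abc")
first = pop_front(b)
print(first, b)
```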

Answered By: anmaia