Deleting elements from list or recreating new list (python)


What is the best/fastest way to delete objects from a list?
Deleting some objects:

[objects.remove(o) for o in objects if o.x <= 0]

or creating a new list:

new_objects = [o for o in objects if o.x > 0]
Asked By: Scott



For starters, don’t use list comprehensions for side effects. It needlessly creates a list of None’s here, which is simply ignored and garbage collected, and is just bad style. List comprehensions are for functional mapping/filtering operations to create new lists.

However, even converted to an equivalent loop there is a classic bug:

>>> objects = [1,1,2,2,1,1]
>>> for obj in objects:
...     if obj == 2:
...         objects.remove(obj)
>>> objects
[1, 1, 2, 1, 1]

This is because the internal iterator essentially keeps an index which it simply increments. When an item is removed, the list shrinks and every later item shifts down by one index, so the item that moves into the current position is skipped. When two matching items appear in a row, the second one is never visited.
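To make the skipped item visible, here is a rough sketch that mimics the iterator with an explicit index (the `visited` list is only there for illustration and is not part of the original code):

```python
objects = [1, 1, 2, 2, 1, 1]
visited = []  # (index, value) pairs the loop actually sees

i = 0
while i < len(objects):
    obj = objects[i]
    visited.append((i, obj))
    if obj == 2:
        objects.remove(obj)  # list shrinks; later items shift down by one
    i += 1                   # the index advances anyway, as the for loop's iterator does

print(visited)  # the second 2 never shows up here
print(objects)
```

After the first 2 is removed at index 2, the second 2 slides into index 2, but the loop has already moved on to index 3.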

But more to the point, removing from a list in a loop is inefficient even if you do it correctly.

It is quadratic time, whereas creating the new list is linear. So really, I think those are the clear advantages of creating a new list.
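If you want to see the difference yourself, something like the following sketch could be used. The function names and the test data are made up for illustration, and the timings will vary by machine:

```python
import timeit

def remove_in_loop(values):
    """Correct but slow: every remove() searches and shifts the list."""
    result = list(values)
    for v in list(result):  # iterate over a snapshot to avoid skipping items
        if v <= 0:
            result.remove(v)
    return result

def rebuild(values):
    """Single pass that builds a new list."""
    return [v for v in values if v > 0]

# Hypothetical data: odd numbers stay positive, even numbers become <= 0.
data = [i if i % 2 else -i for i in range(2000)]

print("remove in loop:", timeit.timeit(lambda: remove_in_loop(data), number=3))
print("rebuild:       ", timeit.timeit(lambda: rebuild(data), number=3))
```

Both functions return the same list; only the amount of work differs.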

As pointed out by @Selcuk in the comments, the advantage of modifying the list is that you don’t use auxiliary space.
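If the auxiliary space matters, one possible approach (not from the original answer, just a sketch) is to compact the list in place in a single pass. `filter_in_place` and `keep` are hypothetical names:

```python
def filter_in_place(items, keep):
    """Keep only items for which keep(item) is true, reusing the same list."""
    write = 0
    for item in items:
        if keep(item):
            items[write] = item  # write never overtakes the read position
            write += 1
    del items[write:]            # drop the leftover tail in one step

objects = [1, 1, 2, 2, 1, 1]
same_list = objects
filter_in_place(objects, lambda v: v != 2)
print(objects)
```

This stays linear and keeps every existing reference to the list valid, at the cost of being less readable than a comprehension.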

Answered By: juanpa.arrivillaga

There is a potential issue with modifying a list this way while iterating over it. When you remove an item from the list, the indices of the remaining items shift down by one, which might be an issue if you are trying to access it by index.

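If you do delete in place, a del statement on the index is faster than the remove() method, because it avoids searching the list for the item to remove. Each del still shifts the later elements, so a single deletion remains linear in the length of the list.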

i = 0
while i < len(objects):
    if objects[i].x <= 0:
        # Deleting shifts the next item into position i, so do not advance.
        del objects[i]
    else:
        i += 1

The trade-off is that the code is less readable.
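Another way to delete by index, sketched here under the assumption that the objects have an `x` attribute as in the question, is to walk the indices from the end so that a deletion never shifts an item you have not visited yet. `SimpleNamespace` is just a stand-in for the real objects:

```python
from types import SimpleNamespace

# Stand-in objects with an x attribute, as in the question.
objects = [SimpleNamespace(x=v) for v in (3, -1, 0, 7, -5)]

# Go from the last index down to 0; deleting at i only moves items after i.
for i in range(len(objects) - 1, -1, -1):
    if objects[i].x <= 0:
        del objects[i]

print([o.x for o in objects])
```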

As for recreating versus updating in place, recreating is faster. Every in-place deletion shifts all the later elements down by one, whereas building a new list touches each element only once.
The cost is extra memory, which can be significant if the list is large.

If you only need to iterate over the filtered items once, you can consider a generator expression instead of a list comprehension. A generator expression is similar to a list comprehension, but it produces a generator object that yields items lazily as you iterate, rather than creating a new list in memory.

new_objects = (o for o in objects if o.x > 0)

For more information about generator expressions, see: Generator expressions vs. list comprehensions
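One thing to keep in mind is that a generator can be consumed only once and has no length. A small sketch, with a hypothetical `Obj` class standing in for the real objects:

```python
from dataclasses import dataclass

@dataclass
class Obj:  # hypothetical stand-in for the objects in the question
    x: int

objects = [Obj(3), Obj(-1), Obj(0), Obj(7)]

positive = (o for o in objects if o.x > 0)  # nothing is filtered yet

first_pass = [o.x for o in positive]   # filtering happens here, lazily
second_pass = [o.x for o in positive]  # the generator is already exhausted

print(first_pass, second_pass)
```

If you need to loop over the result more than once, index into it, or call len() on it, the list comprehension is the better fit.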
