Removing Item From List – during iteration – what's wrong with this idiom?

Question:

As an experiment, I did this:

letters=['a','b','c','d','e','f','g','h','i','j','k','l']
for i in letters:
    letters.remove(i)
print letters

The last print shows that not all items were removed (only every other item was).

IDLE 2.6.2      
>>> ================================ RESTART ================================
>>> 
['b', 'd', 'f', 'h', 'j', 'l']
>>> 

What's the explanation for this? How could this be rewritten to remove every item?

Asked By: monojohnny


Answers:

What you want to do is:

letters[:] = []

or

del letters[:]

This preserves the original object that letters was pointing to. Other options, like letters = [], create a new object and point letters to it; the old object is typically garbage-collected once nothing else references it.
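The difference between emptying a list in place and rebinding the name can be checked with a second reference to the same list. A minimal sketch (the name alias is introduced here only for illustration; it is written to run under both Python 2 and 3):

```python
# Empty the list in place: every reference sees the change.
letters = ['a', 'b', 'c']
alias = letters                          # second name for the same list object
del letters[:]
in_place_same_object = alias is letters  # still one object
in_place_alias = list(alias)             # snapshot of what alias sees now

# Rebind the name: only `letters` changes, the old list survives via `alias`.
letters = ['a', 'b', 'c']
alias = letters
letters = []
rebound_same_object = alias is letters   # two different objects now
rebound_alias = list(alias)              # alias still holds the old items
```

After the in-place deletion alias is empty as well; after rebinding, alias still holds the three original items.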

The reason not all values were removed is that you’re changing the list while iterating over it.

ETA: if you want to filter values from a list you could use list comprehensions like this:

>>> letters=['a','b','c','d','e','f','g','h','i','j','k','l']
>>> [l for l in letters if ord(l) % 2]
['a', 'c', 'e', 'g', 'i', 'k']
Answered By: SilentGhost

You cannot modify the list you are iterating over; otherwise you get this weird type of result. Instead, iterate over a copy of the list:

for i in letters[:]:
  letters.remove(i)
Answered By: stanlekub

You cannot iterate over a list and mutate it at the same time. Instead, iterate over a slice:

letters=['a','b','c','d','e','f','g','h','i','j','k','l']
for i in letters[:]:  # note: [:] creates a slice, i.e. a copy of the whole list
    letters.remove(i)
print letters

That said, for a simple operation such as this, you should simply use:

letters = []

It removes the first occurrence and then moves on to the next position in the sequence. Since the sequence has shifted in the meantime, it takes the item at the next odd position, and so on…

  • take “a”
  • remove “a” -> the first item is now “b”
  • take the next item, “c”
    -…
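
The steps can be traced by recording the position the loop is at each time it hands out an item. A sketch on a shorter list, using enumerate only to expose that position (the names trace and position are illustrative, not part of the original code):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']
trace = []

for position, item in enumerate(letters):
    trace.append((position, item))  # what the loop handed out, and from where
    letters.remove(item)            # everything after it shifts one place left

# trace is now [(0, 'a'), (1, 'c'), (2, 'e')]
# letters is now ['b', 'd', 'f']
```

The position advances by one on every pass while the remaining items shift left by one, so the item that slides into the slot just visited is never handed out.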
Answered By: mamoo

Python probably keeps an internal position, and the removal starts at the front. The list letters seen by the for statement on the second line is partly different from the list letters on the third line. On the first pass a is removed; on the second pass b has already moved to position 1, so c is removed. You can try using a while loop instead.
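
A while loop avoids the problem because it re-checks the list itself on every pass instead of advancing a position. A minimal sketch of that suggestion, assuming the goal is to take items out one at a time (the name removed is illustrative; for simply emptying the list, del letters[:] is shorter):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']
removed = []

while letters:                     # an empty list is false, so the loop ends when nothing is left
    removed.append(letters.pop())  # pop() takes the last item, so nothing has to shift

# removed is now ['f', 'e', 'd', 'c', 'b', 'a'] and letters is []
```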

Answered By: Timo

Some answers explain why this happens and some explain what you should’ve done. I’ll shamelessly put the pieces together.


What’s the reason for this?

Because the Python language is designed to handle this use case differently. The documentation makes it clear:

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.

Emphasis mine. See the linked page for more — the documentation is copyrighted and all rights are reserved.

You could easily understand why you got what you got, but it’s basically undefined behavior that can easily change with no warning from build to build. Just don’t do it.

It’s like wondering why i += i++ + ++i does whatever the hell it is that line does on your architecture, on your specific build of your compiler for your language, including but not limited to trashing your computer and making demons fly out of your nose 🙂


How could this be rewritten to remove every item?

  • del letters[:] (if you need to change all references to this object)
  • letters[:] = [] (if you need to change all references to this object)
  • letters = [] (if you just want to work with a new object)

Maybe you just want to remove some items based on a condition? In that case, you should iterate over a copy of the list. The easiest way to make a copy is to make a slice containing the whole list with the [:] syntax, like so:

#remove unsafe commands
commands = ["ls", "cd", "rm -rf /"]
for cmd in commands[:]:
  if "rm " in cmd:
    commands.remove(cmd)

If your check is not particularly complicated, you can (and probably should) filter instead:

commands = [cmd for cmd in commands if not is_malicious(cmd)]
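
Since is_malicious is not defined above, here is a runnable sketch with a hypothetical stand-in that applies the same "rm " check as the loop. Assigning to commands[:] rather than to commands is an optional variation that keeps the same list object, which matters only if other references to it exist (same_list is an illustrative name):

```python
def is_malicious(cmd):
    # Hypothetical check, for illustration only: flags any command containing "rm ".
    return "rm " in cmd

commands = ["ls", "cd", "rm -rf /"]
same_list = commands  # a second reference to the same list object

# Build the filtered list first, then replace the original's contents in place.
commands[:] = [cmd for cmd in commands if not is_malicious(cmd)]

# commands is now ["ls", "cd"], and same_list still refers to the same object
```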
Answered By: badp
    #!/usr/bin/env python
    import random
    # Top block: iterates over a copy (a[:]), so removing from a is safe
    a=range(10)

    while len(a):
        print a
        for i in a[:]:
            if random.random() > 0.5:
                print "removing: %d" % i
                a.remove(i)
            else:
                print "keeping: %d"  % i           

    print "done!"
    # Bottom block: iterates over a itself while removing from it
    a=range(10)

    while len(a):
        print a
        for i in a:
            if random.random() > 0.5:
                print "removing: %d" % i
                a.remove(i)
            else:
                print "keeping: %d"  % i           

    print "done!"

I think this explains the problem a little better: the top block of code works, whereas the bottom one doesn’t.

In the bottom block, the item that follows each removed item is skipped for that pass, so neither “keeping” nor “removing” is printed for it. That happens because you are modifying the list you are iterating over, which is a recipe for disaster.
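
The randomness makes the output differ on every run. A deterministic sketch of the same comparison, assuming a fixed rule (remove every number below 5) in place of random.random(); the function name, its parameter and the visited list are illustrative:

```python
def remove_small(numbers, iterate_over_copy):
    """Remove every number below 5 and return the items the loop visited."""
    visited = []
    source = numbers[:] if iterate_over_copy else numbers
    for i in source:
        visited.append(i)
        if i < 5:
            numbers.remove(i)
    return visited

safe = list(range(10))
safe_visited = remove_small(safe, iterate_over_copy=True)
# safe_visited is [0, 1, ..., 9]; safe is [5, 6, 7, 8, 9]

unsafe = list(range(10))
unsafe_visited = remove_small(unsafe, iterate_over_copy=False)
# unsafe_visited is [0, 2, 4, 6, 7, 8, 9]: 1, 3 and 5 were never visited
# unsafe is [1, 3, 5, 6, 7, 8, 9]
```

Without the copy, three numbers below 5 survive a single pass because the loop never reached them.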

Answered By: Sam Hodge

OK, I’m a little late to the party here, but I’ve been thinking about this and after looking at Python’s (CPython) implementation code, have an explanation I like. If anyone knows why it’s silly or wrong, I’d appreciate hearing why.

The issue is moving through a list using an iterator, while allowing that list to change.

All the iterator is obliged to do is tell you which item in the (in this case) list comes after the current item (i.e. with the next() function).

I believe the way iterators are currently implemented, they only keep track of the index of the last element they iterated over. Looking in iterobject.c one can see what appears to be a definition of an iterator:

typedef struct {
    PyObject_HEAD
    Py_ssize_t it_index;
    PyObject *it_seq; /* Set to NULL when iterator is exhausted */
} seqiterobject;

where it_seq points to the sequence being iterated over and it_index gives the index of the last item supplied by the iterator.

When the iterator has just supplied the nth item and one deletes that item from the sequence, the correspondence between subsequent list elements and their indices changes. The former (n+1)st item becomes the nth item as far as the iterator is concerned. In other words, the iterator now thinks that what was the ‘next’ item in the sequence is actually the ‘current’ item.

So, when asked to give the next item, it will give the former (n+2)nd item (i.e. the new (n+1)st item).

As a result, for the code in question, the iterator’s next() method is going to give only the n+0, n+2, n+4, … elements from the original list. The n+1, n+3, n+5, … items will never be exposed to the remove statement.
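
This behaviour can be reproduced with a small pure-Python model of an iterator that stores only a sequence and an index. It is a sketch, not CPython's code: the class name SeqIterSketch is made up, and it_index here holds the position to read next, which is a modelling assumption.

```python
class SeqIterSketch(object):
    """Toy iterator that remembers only the sequence and an index."""

    def __init__(self, seq):
        self.it_seq = seq    # set to None once the iterator is exhausted
        self.it_index = 0    # position to read on the next call

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None or self.it_index >= len(self.it_seq):
            self.it_seq = None
            raise StopIteration
        item = self.it_seq[self.it_index]
        self.it_index += 1   # advances even if the list shrinks underneath
        return item

    next = __next__          # Python 2 spelling of the same method


letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
seen = []
for item in SeqIterSketch(letters):
    seen.append(item)
    letters.remove(item)

# seen is ['a', 'c', 'e', 'g', 'i', 'k']
# letters is ['b', 'd', 'f', 'h', 'j', 'l']
```

The leftover list matches the output shown in the question, which is consistent with the index-only explanation.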

Although the intended activity of the code in question is clear (at least for a person), it would probably require much more introspection for an iterator to monitor changes in the sequence it iterates over and, then, to act in a ‘human’ fashion.

If iterators could return prior or current elements of a sequence, there might be a general work-around, but as it is, you need to iterate over a copy of the list, and be certain not to delete any items before the iterator gets to them.

Answered By: user1245262

Initially i refers to a. As the loop runs, the element in the first position is removed and the element from the second position moves into the first position, but the iterator's pointer moves on to the second position. This goes on for the whole list, which is the reason b, d, f, h, j and l are not deleted.


Answered By: Vijayakash Allenki