Python List & for-each access (Find/Replace in built-in list)

Question:

I originally thought Python was a pure pass-by-reference language.

Coming from C/C++ I can’t help but think about memory management, and it’s hard to put it out of my head. So I’m trying to think of it from a Java perspective and think of everything but primitives as a pass by reference.

Problem: I have a list, containing a bunch of instances of a user defined class.

If I use the for-each syntax, i.e.:

for member in my_list:
    print(member.str)

Is member the equivalent of an actual reference to the object?

Is it the equivalent of doing:

i = 0
while i < len(my_list):
    print(my_list[i].str)
    i += 1

I think it’s NOT, because when I’m looking to do a replace, it doesn’t work, that is, this doesn’t work:

for member in my_list:
    if member == some_other_obj:
        member = some_other_obj

What I want is a simple find and replace in a list. Can that be done in a for-each loop, and if so, how? Otherwise, do I have to use the random access syntax (square brackets), or will NEITHER work, so that I need to remove the entry and insert a new one? I.e.:

i = 0
for member in my_list:
    if member == some_other_obj:
        del my_list[i]                     # delete by index; list.remove() takes a value
        my_list.insert(i, some_other_obj)  # put the replacement in the same slot
    i += 1
Asked By: Syndacate


Answers:

Answering this has been good, as the comments have led to an improvement in my own understanding of Python variables.

As noted in the comments, when you loop over a list with something like for member in my_list the member variable is bound to each successive list element. However, re-assigning that variable within the loop doesn’t directly affect the list itself. For example, this code won’t change the list:

my_list = [1, 2, 3]
for member in my_list:
    member = 42  # rebinds the name member only; the list is untouched
print(my_list)

Output:

[1, 2, 3]

If you want to change a list containing immutable types, you need to do something like:

my_list = [1, 2, 3]
for ndx, member in enumerate(my_list):
    my_list[ndx] += 42  # assigns through the index, so the list slot changes
print(my_list)

Output:

[43, 44, 45]

If your list contains mutable objects, you can modify the current member object directly:

class C:
    def __init__(self, n):
        self.num = n
    def __repr__(self):
        return str(self.num)

my_list = [C(i) for i in range(3)]
for member in my_list:
    member.num += 42
print(my_list)

Output:

[42, 43, 44]

Note that you are still not changing the list, simply modifying the objects in the list.
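A quick way to see this distinction is to compare object identities before and after the loop. The sketch below redefines a minimal C class so that it runs on its own; the names before, same_objects and nums are illustrative additions, not part of the original answer.

```python
class C:
    def __init__(self, n):
        self.num = n

my_list = [C(i) for i in range(3)]
before = list(my_list)  # shallow copy: a new list holding the same objects

for member in my_list:
    member.num += 42    # mutates the object that member is bound to

# The list still holds the very same objects; only their state changed.
same_objects = all(a is b for a, b in zip(before, my_list))
nums = [member.num for member in my_list]
print(same_objects, nums)  # True [42, 43, 44]
```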

You might benefit from reading the "Naming and binding" section of the Python Language Reference (in the chapter on the execution model).

Answered By: GreenMatt

You could replace something in there by getting the index along with the item.

>>> foo = ['a', 'b', 'c', 'A', 'B', 'C']
>>> for index, item in enumerate(foo):
...     print(index, item)
...
0 a
1 b
2 c
3 A
4 B
5 C
>>> for index, item in enumerate(foo):
...     if item in ('a', 'A'):
...         foo[index] = 'replaced!'
...
>>> foo
['replaced!', 'b', 'c', 'replaced!', 'B', 'C']
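Applied to the original question (a list of user-defined instances), the same index-based pattern might look like the sketch below. The Item class, its __eq__ method and the replace_equal helper are hypothetical names made up for illustration; the question does not say how its class compares for equality, so matching on a key attribute is an assumption.

```python
class Item:
    """Hypothetical stand-in for the asker's user-defined class."""
    def __init__(self, key, label):
        self.key = key
        self.label = label

    def __eq__(self, other):
        # Assumption: two items are equal when their keys match.
        return isinstance(other, Item) and self.key == other.key

def replace_equal(items, replacement):
    """Replace, in place, every element equal to replacement; return the count."""
    count = 0
    for index, item in enumerate(items):
        if item == replacement:
            items[index] = replacement  # assigns to the list slot, not the loop name
            count += 1
    return count

my_list = [Item(1, "old-a"), Item(2, "old-b"), Item(1, "old-c")]
some_other_obj = Item(1, "new")
replaced = replace_equal(my_list, some_other_obj)
labels = [item.label for item in my_list]
print(replaced, labels)  # 2 ['new', 'old-b', 'new']
```

Because the assignment goes through items[index], the list itself is updated, which is exactly what rebinding member inside the loop could not do.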

Note that if you want to remove something from the list, you have to iterate over a copy of the list. Otherwise you change the size of the list while iterating over it, and the loop silently skips elements (a list raises no error here, unlike a dict). A copy can be made quite easily with a slice.

Wrong:

>>> foo = ['a', 'b', 'c', 1, 2, 3]
>>> for item in foo:
...     if isinstance(item, int):
...         foo.remove(item)
...
>>> foo 
['a', 'b', 'c', 2]

The 2 is still in there because removing 1 shifted 2 into the position the iterator had just visited, so the loop skipped it. The correct way would be:

>>> foo = ['a', 'b', 'c', 1, 2, 3]
>>> for item in foo[:]:
...     if isinstance(item, int):
...         foo.remove(item)
...
>>> foo 
['a', 'b', 'c']
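To see why the first version skips elements, it can help to record which items the loop actually visits. The visited list below is an illustrative addition, and the list comprehension at the end is a commonly used alternative to iterating over a copy, not something the answer above prescribes.

```python
foo = ['a', 'b', 'c', 1, 2, 3]
visited = []
for item in foo:
    visited.append(item)
    if isinstance(item, int):
        foo.remove(item)  # shifts every later element one slot to the left

print(visited)  # ['a', 'b', 'c', 1, 3]  (2 was never visited)
print(foo)      # ['a', 'b', 'c', 2]

# Alternative: build the filtered list first, then assign it back in place.
bar = ['a', 'b', 'c', 1, 2, 3]
bar[:] = [item for item in bar if not isinstance(item, int)]
print(bar)      # ['a', 'b', 'c']
```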
Answered By: Glider

Python is not Java, nor C/C++ — you need to stop thinking that way to really utilize the power of Python.

Python uses neither pass-by-value nor pass-by-reference in the C++ sense; it uses pass-by-object (also called pass-by-assignment or call-by-sharing). In other words, nearly everything is an object bound to a name that you can then use (the two obvious exceptions being tuple and list indexing, where objects are reached by position rather than by name).

When you do spam = "green", you have bound the name spam to the string object "green"; if you then do eggs = spam you have not copied anything, you have not made reference pointers; you have simply bound another name, eggs, to the same object ("green" in this case). If you then bind spam to something else (spam = 3.14159) eggs will still be bound to "green".
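The spam and eggs description can be checked directly with the is operator. This is only a small sketch; same_before is an illustrative name that does not appear in the answer.

```python
spam = "green"
eggs = spam                 # binds a second name to the same string object
same_before = eggs is spam  # both names refer to one object

spam = 3.14159              # rebinds spam only; eggs is unaffected
print(same_before, spam, eggs)  # True 3.14159 green
```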

When a for-loop executes, it takes the name you give it, and binds it in turn to each object in the iterable while running the loop; when you call a function, it takes the names in the function header and binds them to the arguments passed; reassigning a name is actually rebinding a name (it can take a while to absorb this — it did for me, anyway).
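The same rule applies to function arguments, which a short sketch can make concrete. The functions rebind and mutate are hypothetical names chosen for this illustration.

```python
def rebind(items):
    items = [0, 0, 0]  # rebinds the local name; the caller's list is untouched

def mutate(items):
    items[0] = 99      # changes the object that both names are bound to

numbers = [1, 2, 3]
rebind(numbers)
after_rebind = list(numbers)  # snapshot taken after rebind() returned
mutate(numbers)
print(after_rebind, numbers)  # [1, 2, 3] [99, 2, 3]
```

Assigning to the parameter inside rebind behaves like member = 42 inside a for-loop: only the local name changes.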

With for-loops utilizing lists, there are two basic ways to assign back to the list:

for i, item in enumerate(some_list):
    some_list[i] = process(item)

or

new_list = []
for item in some_list:
    new_list.append(process(item))
some_list[:] = new_list

Notice the [:] on that last some_list: it causes a mutation of some_list’s elements (setting the entire thing to new_list’s elements) instead of rebinding the name some_list to new_list. Is this important? It depends! If you have other names besides some_list bound to the same list object, and you want them to see the updates, then you need to use the slicing method; if you don’t, or if you do not want them to see the updates, then rebind with some_list = new_list.
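The difference between slice assignment and rebinding shows up as soon as a second name points at the same list. In this sketch, alias and seen_by_alias are illustrative names, and the arithmetic stands in for whatever process() would do.

```python
some_list = [1, 2, 3]
alias = some_list  # a second name bound to the same list object

some_list[:] = [x * 10 for x in some_list]  # mutates the shared list in place
seen_by_alias = list(alias)                 # alias sees the update

some_list = [x + 1 for x in some_list]      # rebinds some_list to a new list
print(seen_by_alias, alias, some_list)
# [10, 20, 30] [10, 20, 30] [11, 21, 31]
```

After the rebinding, alias still refers to the original (mutated) list, while some_list refers to a brand-new one.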

Answered By: Ethan Furman