What's exactly happening in infinite nested lists?

Question:

It’s possible to create an infinite nested list in Python. That’s clear and, although not popular and definitely not useful is a known fact.

>>> a = [0]
>>> a[0] = a
>>> a
[[...]]
>>> a[0] == a
True

My question is, what is happening here:

>>> a = [0]
>>> b = [0]
>>> a[0], b[0] = b, a
>>> a
[[[...]]]
>>> b
[[[...]]]
>>> a == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> a[0] == b
True
>>> a[0][0] == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> a[0][0][0] == b
True
>>> 

Which each way deeper, when I’m trying to understand it, I feel much more like my brain is going to explode. I see, that a contains b, that contains a and so on…

Now my questions about this one. Do we really have two lists here, or only one? How does a thing like this gets stored in memory? What could be a purpose to enable programers implement something so strange like this?

Please, don’t treat this question super-serious. And don’t forget, that programming can be fun sometimes.

Asked By: Gandi

||

Answers:

I suspect the following happens:

a[0]==b: Python looks up the value a[0] and finds some kind of reference to b, so it says True.

a[0][0]==b: Python looks up a[0], finds b and now looks up a[0][0], which is, (since a[0] holds b) b[0]. Now it sees, that b[0] holds some kind of reference to a, which is not exactly the same as b. So python has to compare elements, which means, it has to compare a[0] against b[0]. Now, the infinite recursion starts…

Note that this works only because Python does not actually copy the list when assigning a[0]=b. Python rather creates a reference to b that is stored in a[0].

Answered By: phimuemue

These are two lists. First, you create them:

a = [0]
b = [0]

And then, you assign each one to the first element of the other one:

a[0], b[0] = b, a

So you can say

a[0] is b

and

b[0] is a

which is the same situation as the first example, but obe level deeper.

Additionally, you don’t compare for identity (is), but for equality (==). This leads to a try to compare them – deeply inside, what leads to a recursion.

Answered By: glglgl

I see, that a contains b, that contains a

They don’t contain each other as such – A is a reference to a list, the first thing in this list is a reference to B, and vice versa

>>> a[0] == b
True
>>> a[0][0] == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> a[0][0][0] == b
True

The number of [0]’s here doesn’t matter, as you can do as many list lookups as you like — what matters is that in example #1 and #3 (and all odd numbers of lookups) you are saying “is B equal to B”, at which point python compares memory addresses and sees that they are the same thing, so says yes. With example #2 (and all even lookups) you are saying “is A equal to B”, python sees that they are different memory addresses, and then tries to load the whole (infinite) data structure into memory to do a more in-depth comparison.

Answered By: Shish

a[0] refers to b and b[0] refers to a. This is a circular reference. As glglgl has mentioned, when you use == operator, it does the comparison of values.

Try this, which might make things more clear –

>>> id(a)
4299818696
>>> id(b)
4299818768
>>> id(a[0])
4299818768
>>> 
>>> id(b[0])
4299818696
Answered By: ronakg

Disclaimer: I don’t use Python, so some things I say may be wrong. Python experts, feel free to correct me.

Great question. I think the central misconception (if I can’t even call it that; it’s perfectly reasonable how you arrived at the thought process you used) you’re having that prompts you to ask the question is this:

When I write b[0] = a, it does not mean that a is in b. It means that b includes a reference that points to the thing that a points to.

The variables a and b themselves aren’t even “things” themselves, and they themselves are also merely pointers to otherwise anonymous “things” in memory.

The concept of references is a major leap from the non-programming world, so let’s step through your program with this in mind:

>>> a = [0]

You create a list that happens to have something in it (ignore that for now). What matters is it’s a list. That list gets stored in memory. Let’s say it’s stored in memory location 1001. Then, the assignment = creates a variable a that the programming language allows you to use later. At this point, there’s some list object in memory and a reference to it that you can access with the name a.

>>> b = [0]

This does the same thing for b. There is a new list that gets stored in memory location 1002. The programming language creates a reference b that you can use to refer to the memory location and in turn the list object.

>>> a[0], b[0] = b, a

This does two things that are identical, so let’s focus on one: a[0] = b. What this does is pretty fancy. It first evaluates the right side of the equality, sees the variable b and fetches the corresponding object in memory (memory object #1002) since b is a reference to it. What happens on the left side is equally fancy. a is a variable that points to an list (memory object #1001), but memory object #1001 itself has a number of references of its own. Instead of those references having names like a and b, which you use, those references have numerical indices like 0. So, now, what this does is a pulls up memory object #1001, which is a pile of indexed references, and it goes to the reference with index 0 (previously, this reference pointed to the actual number 0, which is something you did in line 1) and then repoints that reference (i.e., the first and only reference in memory object #1001) to what the thing on the right side of the equation evaluates to. So now, object #1001’s 0th reference points to object #1002.

>>> a
[[[...]]]
>>> b
[[[...]]]

This is just fanciness done by the programming language. When you just ask it to evaluate a, it pulls up the memory object (the list at location #1001), detects using its own magic that it’s infinite and renders itself as such.

>>> a == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp

The failure of this statement has to do with how Python does comparisons. When you compare an object to itself, it immediately evaluates to true. When you compare and object to another object, it uses “magic” to determine whether the equality should be true or false. In the case of lists in Python, it looks at every item in each list and checks if they are equal (in turn using the items’ own equality-checking methods). So, when you try a == b. What it does is first dig up b (object #1002) and a (object #1001) and then realizes that they are different things in memory so goes to its recursive list checker. It does this by iterating through the two lists. Object #1001 has one element with index 0 that points to object #1002. Object #1002 has one element with index 0 that points to object #1001. Therefore, the program concludes that object #1001 and #1002 are equal if all their references point to the same thing, ergo if #1002 (what #1001’s only reference points to) and #1001 (what #1002’s only reference points to) are the same thing. This equality check can never stop. The same thing would happen in any list that doesn’t stop. You could do c = [0]; d = [0]; c[0] = d; d[0] = c and a == c would raise the same error.

>>> a[0] == b
True

As I hinted to in the previous paragraph, this immediately resolves to true because Python takes a shortcut. It doesn’t need to compare list contents because a[0] points to object #1002 and b points to object #1002. Python detects that they are identical in the literal sense (they are the same “thing”) and doesn’t even bother checking contents.

>>> a[0][0] == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp

This goes back to being an error because a[0][0] ends up pointing to object #1001. The identity check fails and falls back on the recursive content check, which never ends.

>>> a[0][0][0] == b
True

Once again, a[0][0][0] points to object #1002, as does b. The recursive check is skipped and the comparison immediately returns true.


Higher level jibber jabber not directly related to your specific code snippet:

  • Since all there is is references referring to other objects, even though there is what appears to be “infinite” nesting, the object referred to by a (as I’ve called object #1001) and the object referred to be b (#1002) are both the same size in memory. And that size is actually incredibly small since all they are are lists that point to the respective other memory locations.
  • It’s also worth note that in less “generous” languages, comparing two references with == returns true only if the memory objects they point to are the same in the sense that both references point to the same spot in memory. Java is an example of this. The stylistic convention that has emerged in such languages is to define a method/function on objects themselves (for Java, it is conventionally called equals()) to do custom equality testing. Python does this out of the box for lists. I don’t know about Python in particular, but at least in Ruby, == is overloaded in the sense that when you do someobject == otherobject, it actually calls a method called == on someobject (that you can overwrite). In theory, there’d be nothing stopping you from making someobject == otherobject return something other than a boolean.
Answered By: Steven
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.