Python self-referential list produces weird output on sorted

Question:

I recently came across a weird discrepancy when trying to sort self-referential lists using .sort() and sorted(). I was hoping someone could shed some light on this. The code in question is the following:

lst = [1, 2, 3]

lst[0] = lst
lst[1] = lst
lst[2] = lst

print(lst)

print(sorted(lst))

lst.sort()

print(lst)

The above code produces the following output:

[[...], [...], [...]]
[[[...], [...], [...]], [[...], [...], [...]], [[...], [...], [...]]]
[[...], [...], [...]]

It’s the output from print(sorted(lst)) that baffles me. Wonder if it is some form of recursion that’s causing it?

Asked By: abhishekbasu

||

Answers:

I’m going to call your list x, because frankly l looks too much like the number one and is throwing me off. So you’ve got a list x which looks like this

[x, x, x]

Now, we do

print(x)

Python is smart enough to say "hey, look, this list contains itself recursively, let’s not print it again inside of itself." All of the places x appears in your list, we get [...]

[[...], [...], [...]]

Now consider

x.sort()
print(x)

We sort the list, which doesn’t do much since every element is the same. However, crucially, it all happens in-place. The list started out looking like [x, x, x] and will end looking like [x, x, x], where x is our list. So the print looks the same.

[[...], [...], [...]]

Finally, your fun example.

sorted(x)

sorted, unlike list.sort, does not modify the list and instead produces a new list. Let’s call this new list y. In your x.sort example, at the end, we have the same list x that looks like x = [x, x, x]. When we print the list, we immediately see the recursion and stop printing.

However, sorted(x) produces a new list. The list will still look like [x, x, x], but it’s not the list x. It’s a new list y = [x, x, x].

Now, we do

print(sorted(x))

Python sees a list of three elements: [x, x, x]. We look at each of those elements. We’re printing y, so the fact that this list contains x is not a recursion problem; it’s a perfectly ordinary list that contains other lists. So we print x inside of y. Now, one more layer down, we look inside x and see that it contains, lo and behold, x again. That is a recursion problem, but it happened one step later, because we made a new list that, although it looks identical to the original, is distinct.

[[[...], [...], [...]], [[...], [...], [...]], [[...], [...], [...]]]
Answered By: Silvio Mayolo

Maybe you thought after lst[0] = lst that lst would be [[1,2,3],2,3]. If so, you’d be assuming = lst passes by value. But lists are passed by reference, so at that point lst is [lst,2,3].

To pass by value, use lst[0] = list(lst) etc. This time the right-hand side creates a new list, with the same value as before but a new reference, since in this context list(lst) is syntactic sugar for [v for v in lst].

As @DeepSpace noted, this fact about list also explains why print(list(lst)) is three times more verbose than print(lst); it prints [v for v in lst], which is just [lst, lst, lst].

Answered By: J.G.
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.