Is there a difference between [] and list() when using id()?

Question:

Can somebody explain the following?

Why is the id the same, but the lists are different?

>>> [] is []
False
>>> id([]) == id([])
True

Is there a difference in how the lists are created?

>>> id(list()) == id(list())
False
>>> id([]) == id([])
True

Why is this happening? I only ever see two different list addresses. Why not just one, or three or more?

>>> [].__repr__
<method-wrapper '__repr__' of list object at 0x7fd2be868128>
>>> [].__repr__
<method-wrapper '__repr__' of list object at 0x7fd2be868170>
>>> [].__repr__
<method-wrapper '__repr__' of list object at 0x7fd2be868128>
>>> [].__repr__
<method-wrapper '__repr__' of list object at 0x7fd2be868170>
Asked By: Vlad Okrimenko


Answers:

You are using id() incorrectly. id([]) returns the identity (in CPython, the memory address) of a list object that is discarded immediately, because nothing references it anymore once id() returns. The next id([]) call gives Python an opportunity to reuse that memory, and lo and behold, the two addresses are indeed the same.

However, this is an implementation detail that you can’t rely on, and Python won’t always be able to reuse the memory address.
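A minimal sketch of why the address becomes available again, assuming CPython's reference counting. Plain lists do not support weak references, so TrackedList and id_of_temporary are hypothetical helpers introduced here only so that a weak reference can observe when the temporary object is freed.

```python
import weakref


class TrackedList(list):
    """Hypothetical helper: a list subclass that supports weak references."""


def id_of_temporary():
    temp = TrackedList()
    watcher = weakref.ref(temp)  # a weak reference does not keep temp alive
    # The last strong reference to temp disappears when this function returns.
    return id(temp), watcher


ident, watcher = id_of_temporary()

# On CPython the list has already been deallocated at this point,
# so its address is free to be handed to the next object.
is_dead = watcher() is None
print(ident, is_dead)
```

The integer returned by id() outlives the object it described, which is why comparing two such integers says nothing about whether two distinct objects existed.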

Note that id() values are only unique for the lifetime of the object, see the documentation:

This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.


id(list()) probably can’t reuse the memory location because list() is a function call: Python pushes a new frame onto the stack for the call and pops it again when list() returns, and those extra heap mutations change which memory is free.
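To make the difference between the two spellings visible, the standard dis module can list the bytecode for each expression. This is only a sketch: opnames is a hypothetical helper, the exact instruction names vary between Python versions, and whether id(list()) == id(list()) is False on a given interpreter depends on that version and its allocator.

```python
import dis


def opnames(source):
    """Return the bytecode instruction names for a single expression."""
    code = compile(source, "<demo>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")

print(literal_ops)  # builds the list directly with BUILD_LIST
print(call_ops)     # looks up the name list, then calls it
```

The literal is built by a single dedicated instruction, while list() goes through a name lookup and a call, which is the extra work the paragraph above refers to.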

Both [] and list() produce a new empty list object, but to compare two of them you first need to keep both alive by binding them to names (here a and b):

>>> a, b = [], []
>>> a is b
False
>>> id(a) == id(b)
False
>>> a, b = list(), list()
>>> a is b
False
>>> id(a) == id(b)
False

The same thing happens when you use [].__repr__. The Python interactive interpreter has a special global name, _, that you can use to reference the last result produced:

>>> [].__repr__
<method-wrapper '__repr__' of list object at 0x10e011608>
>>> _
<method-wrapper '__repr__' of list object at 0x10e011608>

That creates an extra reference, so the __repr__ method, and by extension, the empty list you created for it, are still considered active. The memory location is not freed and not available for the next list you create.

But when you execute [].__repr__ again, Python rebinds _ to the new method object. Suddenly the previous __repr__ method is no longer referenced by anything and can be freed, and so can the list object it belonged to.

The third time you execute [].__repr__, the first memory location is available again for reuse, so Python does just that:

>>> [].__repr__  # create a new method
<method-wrapper '__repr__' of list object at 0x10e00cb08>
>>> _            # now _ points to the new method
<method-wrapper '__repr__' of list object at 0x10e00cb08>
>>> [].__repr__  # so the old address can be reused
<method-wrapper '__repr__' of list object at 0x10e011608>
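The role of _ can be reproduced outside the interactive prompt as a rough sketch, assuming CPython's reference counting. sys.__displayhook__ is the default hook the interpreter calls to echo an expression result and store it in builtins._; TrackedList is a hypothetical list subclass used only because plain lists do not support weak references.

```python
import builtins
import sys
import weakref


class TrackedList(list):
    """Hypothetical helper: a list subclass that supports weak references."""


tracked = TrackedList()
watcher = weakref.ref(tracked)

# Mimic the prompt echoing the expression tracked.__repr__.
sys.__displayhook__(tracked.__repr__)
del tracked  # drop our own name; only _ -> method -> list remains

alive_while_bound = watcher() is not None

# Echoing another result rebinds _, releasing the method and its list.
sys.__displayhook__("something else")
alive_after_rebind = watcher() is not None

print(alive_while_bound, alive_after_rebind)
```

While _ points at the method, the list stays alive; once _ is rebound, both are freed and the address can be reused.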

You never have more than two lists alive at the same time: the previous one (still referenced through _) and the current one. If you want to see more memory locations, use variables to hold additional references.
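As an illustration of holding additional references, the sketch below keeps several method objects in a list (the names kept and addresses are made up for this example). Because all five lists are alive at once, their id() values are guaranteed to differ.

```python
# Keep every method object (and therefore every list) alive in a container.
kept = [[].__repr__ for i in range(5)]

# __self__ is the list that each method-wrapper is bound to.
addresses = {id(method.__self__) for method in kept}
print(len(addresses))  # 5
```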

Answered By: Martijn Pieters