Does `b = a` make `b` be the same object? Why?
Question:
Why do I get this result at the REPL?
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
My understanding is that `is` only evaluates to True if the two variables point to the same object, when in this case they’re different objects with the same value. `==` would evaluate to True, but `is` shouldn’t.
Going further:
>>> b.reverse()
>>> (a, b)
([3, 2, 1], [3, 2, 1])
It seems that as far as the interpreter is concerned, they are the same object, and operations on b will automatically be performed on a. Again, why?
Answers:
They are actually referencing the same object.
Try this:
a = [1,2,3]
b = a
print(b is a)
b[0] = 0
print(b is a)
You will see that both a and b were changed and are still the same as each other.
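A minimal sketch of the same experiment, with `id()` added to make the shared identity visible. The helper names `same_before` and `same_after` are introduced here only for illustration:

```python
a = [1, 2, 3]
b = a                      # binds a second name to the same list; nothing is copied

same_before = b is a       # identity check before the mutation
b[0] = 0                   # mutate the list through the name b
same_after = b is a        # identity check after the mutation

print(same_before, same_after)   # True True
print(a)                         # [0, 2, 3]
print(id(a) == id(b))            # True
```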
a = [1,2,3]
b = a
print(b is a)
You are comparing references to the same list. If you do the following:
a = [1,2,3]
b = [1,2,3]
print(b is a)
you should get False.
a = [81, 82, 83]
b = a
print(a is b) #prints True
Here a and b are two names for one list object.
Compare that with:
a = [81,82,83]
b = [81,82,83]
print(a is b) # False
print(a == b) #True, as == only checks value equality
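The same distinction holds for user-defined objects: `==` asks about value (via `__eq__`), while `is` asks about identity. Here is a small sketch using a hypothetical `Point` class that is not part of the original answers:

```python
class Point:
    """Hypothetical example class; equality is defined by coordinates."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)   # a separate object with equal contents
r = p             # another name for the same object

print(p == q, p is q)   # True False
print(p == r, p is r)   # True True
```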
In [24]: import sys
In [25]: a=[1,2,3]
In [26]: sys.getrefcount(a) #2 references to [1,2,3]: the name a, plus the temporary argument of getrefcount itself
Out[26]: 2
In [27]: b=a #now b also points to [1,2,3]
In [28]: sys.getrefcount(a) # reference to [1,2,3] got increased by 1,
# as b now also points to [1,2,3]
Out[28]: 3
In [29]: id(a)
Out[29]: 158656524 #both have the same id(), that's why "b is a" is True
In [30]: id(b)
Out[30]: 158656524
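The session above can be condensed into a short script. This is a sketch that assumes CPython, since `sys.getrefcount` is an implementation detail and the absolute numbers vary between versions; only the difference between the counts is meaningful here:

```python
import sys

a = [1, 2, 3]
before = sys.getrefcount(a)     # includes the temporary reference held by the call itself
b = a                           # one more name bound to the same list
after = sys.getrefcount(a)
del b                           # unbinding the name drops that reference again
after_del = sys.getrefcount(a)

print(after - before)           # 1 on CPython
print(after_del == before)      # True on CPython
```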
When to use the copy module:
In [1]: a=[1,2,3]
In [2]: b=a
In [3]: id(a),id(b)
Out[3]: (143186380, 143186380) #both point to the same object
In [4]: b=a[:] #now use slicing, it is equivalent to b=copy.copy(a)
# or b= list(a)
In [5]: id(a),id(b)
Out[5]: (143186380, 143185260) #as expected both now point to different objects
# so now changing one will not affect other
In [6]: a=[[1,2],[3,4]] #list of lists
In [7]: b=a[:] #use slicing
In [8]: id(a),id(b) #now both point to different object as expected
# But what about the internal lists?
Out[8]: (143184492, 143186380)
In [11]: [(id(x),id(y)) for (x,y) in zip(a,b)] #so internal list are still same objects
#so doing a[0][1]=5 will change b[0] too
Out[11]: [(143185036, 143185036), (143167244, 143167244)]
In [12]: from copy import deepcopy #to fix that use deepcopy
In [13]: b=deepcopy(a)
In [14]: [(id(x),id(y)) for (x,y) in zip(a,b)] #now internal lists are different too
Out[14]: [(143185036, 143167052), (143167244, 143166924)]
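The shallow versus deep distinction from the session above, as a script that does not depend on particular `id()` values. The names `shallow` and `deep` are chosen here for illustration:

```python
import copy

a = [[1, 2], [3, 4]]
shallow = a[:]               # new outer list, same inner lists
deep = copy.deepcopy(a)      # new outer list and new inner lists

a[0][1] = 5                  # mutate an inner list through a

print(shallow[0])   # [1, 5]  (inner list is shared with a)
print(deep[0])      # [1, 2]  (independent copy)
print(shallow is a, shallow[0] is a[0], deep[0] is a[0])   # False True False
```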
for more details:
In [32]: def func():
....: a=[1,2,3]
....: b=a
....:
....:
In [34]: import dis
In [35]: dis.dis(func)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a) #now 'a' points to [1,2,3]
3 15 LOAD_FAST 0 (a) #load the object referenced by a
18 STORE_FAST 1 (b) #bind b to that same object (no copy is made)
21 LOAD_CONST 0 (None)
24 RETURN_VALUE
In [36]: def func1():
....: a=[1,2,3]
....: b=[1,2,3]
....:
....:
In [37]: dis.dis(func1) #here both a and b are loaded separately
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a)
3 15 LOAD_CONST 1 (1)
18 LOAD_CONST 2 (2)
21 LOAD_CONST 3 (3)
24 BUILD_LIST 3
27 STORE_FAST 1 (b)
30 LOAD_CONST 0 (None)
33 RETURN_VALUE
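The exact bytecode differs between CPython versions, so the listings above may not match what a current interpreter prints. A version-tolerant way to check the same point is to count how many lists each function builds. This is a sketch assuming CPython; `count_build_list` is a hypothetical helper written for this example:

```python
import dis


def func():
    a = [1, 2, 3]
    b = a            # no second list is built here


def func1():
    a = [1, 2, 3]
    b = [1, 2, 3]    # a second list is built here


def count_build_list(f):
    """Count BUILD_LIST instructions in f's bytecode."""
    return sum(1 for ins in dis.get_instructions(f) if ins.opname == "BUILD_LIST")


print(count_build_list(func))    # 1
print(count_build_list(func1))   # 2
```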
I’m not sure if lists work the same but have a look at this from a numpy.array() tutorial regarding shallow and deep copies: http://www.scipy.org/Tentative_NumPy_Tutorial#head-1529ae93dd5d431ffe3a1001a4ab1a394e70a5f2
`b = a` simply creates a new reference to the same object. To get a real copy, the list object has something similar to the copy example in the link: `b = a.copy()` (available on lists since Python 3.3; note that it makes a shallow copy). Then you could say there are two references to two separate objects with the same values.
Also I think most OO languages work like this, in that `=` just creates a new reference and not a new object.
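For a plain list this can be tried directly. A short sketch, reusing the `reverse()` call from the question:

```python
a = [1, 2, 3]
b = a.copy()      # list.copy() makes a shallow copy (Python 3.3+)

b.reverse()       # mutating the copy leaves the original alone
print(a, b)       # [1, 2, 3] [3, 2, 1]
print(a is b)     # False
```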
When you do `a = [1, 2, 3]` you’re binding the name `a` to a list object. When you do `b = a`, you’re binding the name `b` to whatever `a` is, in this case the list object. Ergo, they’re the same. An object can have multiple names. It’s worth reading up on the Python Data Model.
If you wanted to make a copy of your list object, you can use `b = a[:]` to create a shallow copy via slicing, or `copy.copy` for a shallow copy (which should work on arbitrary objects), or `copy.deepcopy` for, unsurprisingly, a deep copy.
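To illustrate the "arbitrary objects" remark, here is a sketch with a hypothetical `Config` class (not from the original answers) that compares an alias, a shallow copy, and a deep copy:

```python
import copy


class Config:
    """Hypothetical container, used only for illustration."""

    def __init__(self, tags):
        self.tags = tags


original = Config(tags=["x", "y"])
alias = original                    # same object, second name
shallow = copy.copy(original)       # new Config, but .tags is shared
deep = copy.deepcopy(original)      # new Config with its own .tags list

original.tags.append("z")
print(alias.tags)     # ['x', 'y', 'z']
print(shallow.tags)   # ['x', 'y', 'z']
print(deep.tags)      # ['x', 'y']
```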
You’ll also notice something surprising in CPython, which caches short strings and small integers:
>>> a = 4534534
>>> b = a
>>> a is b
True
>>> b = 4534534
>>> a is b
False
>>> a = 1
>>> b = a
>>> a is b
True
>>> b = 1
>>> a is b
True
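Because this caching is an implementation detail, the practical consequence is to compare values with `==` and keep `is` for identity checks. A small sketch, where the numbers are arbitrary and chosen only for illustration:

```python
n = 10 ** 6
a = n + 1
b = n + 1          # computed separately at run time

print(a == b)      # True: the values are equal
# Whether `a is b` is True here depends on the interpreter,
# so this sketch deliberately does not rely on it.
```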
> This code prints True. Why?
Because b is a.
> “is” only returns True if the two variables point to the same object
If they name the same object. “Point to” is vulgar terminology that alludes to a much lower level model of programming.
> when in this case they’re different objects with the same value.
No, they aren’t.
In Python, `b = a` means “`b` shall cease to be a name for whatever it currently names, if anything, and become a name for whatever `a` currently names”. The same object. Not a copy.
Things do not get copied implicitly in Python.
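A short sketch of the difference between mutating the shared object and rebinding one of its names, following the description above:

```python
a = [1, 2, 3]
b = a              # b names the same list as a
b.append(4)        # mutation: visible through both names
print(a)           # [1, 2, 3, 4]

b = [9, 9, 9]      # rebinding: b now names a different list
print(a)           # [1, 2, 3, 4]  (a is untouched)
print(a is b)      # False
```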
> prints [3, 2, 1] [3, 2, 1], it seems that as far as the interpreter is concerned, they ARE the same object
Because they are.
> and operations on b will automatically be performed on a.
Because they are the same object.
> Again, why?
Again, because they are.
…It is as though you thought of every obvious test confirming the behaviour, but refuse to reject your core assumption once every test contradicts it, even though there is nothing in the literature supporting your core assumption (because in fact it is false).
> I’ve never seen anything like this happen before.
Then you must never have tested anything like this before in Python, because it has always worked this way in Python. It’s not even that strange among programming languages; Java does the same thing for everything that’s not of a primitive type, and C# does the same thing for classes (reference types) while doing what you apparently expect for structs (value types). It’s called “reference semantics” and it’s by no means a new idea.