Does Python make a copy of objects on assignment?
Question:
When I try this code:
dict_a = dict_b = dict_c = {}
dict_c['hello'] = 'goodbye'
print(dict_a)
print(dict_b)
print(dict_c)
I expected that it would just initialise the dict_a, dict_b and dict_c dictionaries, and then assign a key in dict_c, resulting in
{}
{}
{'hello': 'goodbye'}
But it seems to have a copy-through effect instead:
{'hello': 'goodbye'}
{'hello': 'goodbye'}
{'hello': 'goodbye'}
Why?
Answers:
This is because in Python, variables (names) are just references to individual objects. When you assign dict_a = dict_b, you are really copying a reference (a memory address or pointer, if you will) from dict_b to dict_a. There is still only one instance of that dictionary.
To get the desired behavior, use either the dict.copy method, or use copy.deepcopy if your dict may have nested dicts or other nested objects.
>>> a = {1:2}
>>> b = a.copy()
>>> b
{1: 2}
>>> b[3] = 4
>>> a
{1: 2}
>>> b
{1: 2, 3: 4}
>>>
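To make the difference between the two copies concrete, here is a small sketch. The names original, shallow and deep are made up for illustration. dict.copy only copies the top level, so nested objects are still shared with the original, while copy.deepcopy copies the nested objects as well.

```python
import copy

original = {"outer": {"inner": 1}}

shallow = original.copy()       # new top-level dict, nested dict is still shared
deep = copy.deepcopy(original)  # new top-level dict and a new nested dict

# Mutate the nested dict through the original name.
original["outer"]["inner"] = 2

print(shallow["outer"]["inner"])  # 2, the shallow copy sees the change
print(deep["outer"]["inner"])     # 1, the deep copy is unaffected
```

If your values are all immutable (strings, numbers, tuples of those), the shallow copy is enough.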
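Your first assignment assigns the same dictionary object to the variables dict_a, dict_b, and dict_c. It has the same effect as dict_c = {}; dict_b = dict_c; dict_a = dict_c (strictly speaking, Python evaluates {} once and then binds the names from left to right, but the result is the same).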
As danben previously said, you're not copying the dict at all; you're binding the same dict to 3 variables, so that each one refers to the same object.
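You can check this yourself with the is operator and the built-in id function. This is only a quick sketch; x and y are throwaway names added for comparison.

```python
dict_a = dict_b = dict_c = {}

# All three names are bound to one and the same object.
print(dict_a is dict_b is dict_c)  # True
print(id(dict_a) == id(dict_c))    # True

# Two separate literals create two separate objects,
# even though they compare equal.
x, y = {}, {}
print(x is y)  # False
print(x == y)  # True
```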
To get the behaviour you want, you should instantiate a different dict in each variable:
>>> dict_a, dict_b, dict_c = {}, {}, {}
>>> dict_c['hello'] = 'goodbye'
>>> print(dict_a)
{}
>>> print(dict_b)
{}
>>> print(dict_c)
{'hello': 'goodbye'}
>>>
Even though
>>> dict_a, dict_b, dict_c = {}, {}, {}
is the right way to go in most cases, once you have more than 3 names it starts to look weird.
Imagine
>>> a, b, c, d, e, f = {}, {}, {}, {}, {}, {}
In cases where I want to initialize more than 3 things, I use
>>> a, b, c, d, e, f = [dict() for _ in range(6)]
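One thing to watch out for with this pattern: the comprehension calls dict() once per iteration, so you get separate objects. Multiplying a list, as in [{}] * 6, looks similar but repeats a reference to a single dict, which brings back the original problem. A short sketch (dicts and shared are illustrative names):

```python
dicts = [dict() for _ in range(6)]  # six separate dicts
shared = [{}] * 6                   # one dict, referenced six times

dicts[0]["hello"] = "goodbye"
shared[0]["hello"] = "goodbye"

print(dicts[1])   # {}
print(shared[1])  # {'hello': 'goodbye'}
```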
I agree with what is said above. The key here is that, in Python, an assignment binds a name to an object; it does not copy the object.
I was trying to grasp the concept myself, and I think it is important to understand when a new object is created and when an existing one is changed.
In the example above, the line:
dict_c['hello'] = 'goodbye'
doesn’t create a new object. It only changes the object which is referenced by dict_a, dict_b, and dict_c.
If, instead, you wrote:
dict_c = {'hello': 'goodbye'}
it would create a new object which would be referenced by dict_c. dict_a and dict_b would still be pointing to the original empty dict.
In that case, if you run:
print(dict_a)
print(dict_b)
print(dict_c)
you would get:
{}
{}
{'hello': 'goodbye'}
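Here are both cases side by side, as a rough sketch. The value 'again' is just a made-up placeholder so that the two dicts are easy to tell apart.

```python
dict_a = dict_b = dict_c = {}

# Mutation: changes the one shared object, all names still refer to it.
dict_c["hello"] = "goodbye"
print(dict_a is dict_c)  # True

# Rebinding: dict_c now names a brand new object; dict_a and dict_b do not.
dict_c = {"hello": "again"}
print(dict_a is dict_c)  # False
print(dict_a is dict_b)  # True

print(dict_a)  # {'hello': 'goodbye'}
print(dict_c)  # {'hello': 'again'}
```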