Why updating "shallow" copy dictionary doesn't update "original" dictionary?
Question:
While reading up the documentation for dict.copy()
, it says that it makes a shallow copy of the dictionary. Same goes for the book I am following (Beazley’s Python Reference), which says:
The m.copy() method makes a shallow
copy of the items contained in a
mapping object and places them in a
new mapping object.
Consider this:
>>> original = dict(a=1, b=2)
>>> new = original.copy()
>>> new.update({'c': 3})
>>> original
{'a': 1, 'b': 2}
>>> new
{'a': 1, 'c': 3, 'b': 2}
So I assumed this would update the value of original
(and add ‘c’: 3) also since I was doing a shallow copy. Like if you do it for a list:
>>> original = [1, 2, 3]
>>> new = original
>>> new.append(4)
>>> new, original
([1, 2, 3, 4], [1, 2, 3, 4])
This works as expected.
Since both are shallow copies, why is that the dict.copy()
doesn’t work as I expect it to? Or my understanding of shallow vs deep copying is flawed?
Answers:
By “shallow copying” it means the content of the dictionary is not copied by value, but just creating a new reference.
>>> a = {1: [1,2,3]}
>>> b = a.copy()
>>> a, b
({1: [1, 2, 3]}, {1: [1, 2, 3]})
>>> a[1].append(4)
>>> a, b
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
In contrast, a deep copy will copy all contents by value.
>>> import copy
>>> c = copy.deepcopy(a)
>>> a, c
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
>>> a[1].append(5)
>>> a, c
({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})
So:
-
b = a
: Reference assignment, Make a
and b
points to the same object.
![Illustration of 'a = b': 'a' and 'b' both point to '{1: L}', 'L' points to '[1, 2, 3]'.](https://i.stack.imgur.com/4AQC6.png)
-
b = a.copy()
: Shallow copying, a
and b
will become two isolated objects, but their contents still share the same reference
![Illustration of 'b = a.copy()': 'a' points to '{1: L}', 'b' points to '{1: M}', 'L' and 'M' both point to '[1, 2, 3]'.](https://i.stack.imgur.com/Vtk4m.png)
-
b = copy.deepcopy(a)
: Deep copying, a
and b
‘s structure and content become completely isolated.
![Illustration of 'b = copy.deepcopy(a)': 'a' points to '{1: L}', 'L' points to '[1, 2, 3]'; 'b' points to '{1: M}', 'M' points to a different instance of '[1, 2, 3]'.](https://i.stack.imgur.com/BO4qO.png)
“new” and “original” are different dicts, that’s why you can update just one of them.. The items are shallow-copied, not the dict itself.
Contents are shallow copied.
So if the original dict
contains a list
or another dictionary
, modifying one them in the original or its shallow copy will modify them (the list
or the dict
) in the other.
Take this example:
original = dict(a=1, b=2, c=dict(d=4, e=5))
new = original.copy()
Now let’s change a value in the ‘shallow’ (first) level:
new['a'] = 10
# new = {'a': 10, 'b': 2, 'c': {'d': 4, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 4, 'e': 5}}
# no change in original, since ['a'] is an immutable integer
Now let’s change a value one level deeper:
new['c']['d'] = 40
# new = {'a': 10, 'b': 2, 'c': {'d': 40, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 40, 'e': 5}}
# new['c'] points to the same original['d'] mutable dictionary, so it will be changed
It’s not a matter of deep copy or shallow copy, none of what you’re doing is deep copy.
Here:
>>> new = original
you’re creating a new reference to the the list/dict referenced by original.
while here:
>>> new = original.copy()
>>> # or
>>> new = list(original) # dict(original)
you’re creating a new list/dict which is filled with a copy of the references of objects contained in the original container.
Adding to kennytm’s answer. When you do a shallow copy parent.copy() a new dictionary is created with same keys,but the values are not copied they are referenced.If you add a new value to parent_copy it won’t effect parent because parent_copy is a new dictionary not reference.
parent = {1: [1,2,3]}
parent_copy = parent.copy()
parent_reference = parent
print id(parent),id(parent_copy),id(parent_reference)
#140690938288400 140690938290536 140690938288400
print id(parent[1]),id(parent_copy[1]),id(parent_reference[1])
#140690938137128 140690938137128 140690938137128
parent_copy[1].append(4)
parent_copy[2] = ['new']
print parent, parent_copy, parent_reference
#{1: [1, 2, 3, 4]} {1: [1, 2, 3, 4], 2: ['new']} {1: [1, 2, 3, 4]}
The hash(id) value of parent[1], parent_copy[1] are identical which implies [1,2,3] of parent[1] and parent_copy[1] stored at id 140690938288400.
But hash of parent and parent_copy are different which implies
They are different dictionaries and parent_copy is a new dictionary having values reference to values of parent
In your second part, you should use new = original.copy()
.copy
and =
are different things.
While reading up the documentation for dict.copy()
, it says that it makes a shallow copy of the dictionary. Same goes for the book I am following (Beazley’s Python Reference), which says:
The m.copy() method makes a shallow
copy of the items contained in a
mapping object and places them in a
new mapping object.
Consider this:
>>> original = dict(a=1, b=2)
>>> new = original.copy()
>>> new.update({'c': 3})
>>> original
{'a': 1, 'b': 2}
>>> new
{'a': 1, 'c': 3, 'b': 2}
So I assumed this would update the value of original
(and add ‘c’: 3) also since I was doing a shallow copy. Like if you do it for a list:
>>> original = [1, 2, 3]
>>> new = original
>>> new.append(4)
>>> new, original
([1, 2, 3, 4], [1, 2, 3, 4])
This works as expected.
Since both are shallow copies, why is that the dict.copy()
doesn’t work as I expect it to? Or my understanding of shallow vs deep copying is flawed?
By “shallow copying” it means the content of the dictionary is not copied by value, but just creating a new reference.
>>> a = {1: [1,2,3]}
>>> b = a.copy()
>>> a, b
({1: [1, 2, 3]}, {1: [1, 2, 3]})
>>> a[1].append(4)
>>> a, b
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
In contrast, a deep copy will copy all contents by value.
>>> import copy
>>> c = copy.deepcopy(a)
>>> a, c
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
>>> a[1].append(5)
>>> a, c
({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})
So:
-
b = a
: Reference assignment, Makea
andb
points to the same object. -
b = a.copy()
: Shallow copying,a
andb
will become two isolated objects, but their contents still share the same reference -
b = copy.deepcopy(a)
: Deep copying,a
andb
‘s structure and content become completely isolated.
“new” and “original” are different dicts, that’s why you can update just one of them.. The items are shallow-copied, not the dict itself.
Contents are shallow copied.
So if the original dict
contains a list
or another dictionary
, modifying one them in the original or its shallow copy will modify them (the list
or the dict
) in the other.
Take this example:
original = dict(a=1, b=2, c=dict(d=4, e=5))
new = original.copy()
Now let’s change a value in the ‘shallow’ (first) level:
new['a'] = 10
# new = {'a': 10, 'b': 2, 'c': {'d': 4, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 4, 'e': 5}}
# no change in original, since ['a'] is an immutable integer
Now let’s change a value one level deeper:
new['c']['d'] = 40
# new = {'a': 10, 'b': 2, 'c': {'d': 40, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 40, 'e': 5}}
# new['c'] points to the same original['d'] mutable dictionary, so it will be changed
It’s not a matter of deep copy or shallow copy, none of what you’re doing is deep copy.
Here:
>>> new = original
you’re creating a new reference to the the list/dict referenced by original.
while here:
>>> new = original.copy()
>>> # or
>>> new = list(original) # dict(original)
you’re creating a new list/dict which is filled with a copy of the references of objects contained in the original container.
Adding to kennytm’s answer. When you do a shallow copy parent.copy() a new dictionary is created with same keys,but the values are not copied they are referenced.If you add a new value to parent_copy it won’t effect parent because parent_copy is a new dictionary not reference.
parent = {1: [1,2,3]}
parent_copy = parent.copy()
parent_reference = parent
print id(parent),id(parent_copy),id(parent_reference)
#140690938288400 140690938290536 140690938288400
print id(parent[1]),id(parent_copy[1]),id(parent_reference[1])
#140690938137128 140690938137128 140690938137128
parent_copy[1].append(4)
parent_copy[2] = ['new']
print parent, parent_copy, parent_reference
#{1: [1, 2, 3, 4]} {1: [1, 2, 3, 4], 2: ['new']} {1: [1, 2, 3, 4]}
The hash(id) value of parent[1], parent_copy[1] are identical which implies [1,2,3] of parent[1] and parent_copy[1] stored at id 140690938288400.
But hash of parent and parent_copy are different which implies
They are different dictionaries and parent_copy is a new dictionary having values reference to values of parent
In your second part, you should use new = original.copy()
.copy
and =
are different things.