What is the difference between shallow copy, deepcopy and normal assignment operation?

Question:

import copy

a = "deepak"
b = 1, 2, 3, 4
c = [1, 2, 3, 4]
d = {1: 10, 2: 20, 3: 30}

a1 = copy.copy(a)
b1 = copy.copy(b)
c1 = copy.copy(c)
d1 = copy.copy(d)


print("immutable - id(a)==id(a1)", id(a) == id(a1))
print("immutable - id(b)==id(b1)", id(b) == id(b1))
print("mutable - id(c)==id(c1)", id(c) == id(c1))
print("mutable - id(d)==id(d1)", id(d) == id(d1))

I get the following results:

immutable - id(a)==id(a1) True
immutable - id(b)==id(b1) True
mutable - id(c)==id(c1) False
mutable - id(d)==id(d1) False

If I perform deepcopy:

a1 = copy.deepcopy(a)
b1 = copy.deepcopy(b)
c1 = copy.deepcopy(c)
d1 = copy.deepcopy(d)

results are the same:

immutable - id(a)==id(a1) True
immutable - id(b)==id(b1) True
mutable - id(c)==id(c1) False
mutable - id(d)==id(d1) False

If I work on assignment operations:

a1 = a
b1 = b
c1 = c
d1 = d

then results are:

immutable - id(a)==id(a1) True
immutable - id(b)==id(b1) True
mutable - id(c)==id(c1) True
mutable - id(d)==id(d1) True

Can somebody explain what exactly makes a difference between the copies? Is it something related to mutable & immutable objects? If so, can you please explain it to me?

Asked By: deeshank

||

Answers:

For immutable objects, there is no need for copying because the data will never change, so Python uses the same data; ids are always the same. For mutable objects, since they can potentially change, [shallow] copy creates a new object.

Deep copy is related to nested structures. If you have list of lists, then deepcopy copies the nested lists also, so it is a recursive copy. With just copy, you have a new outer list, but inner lists are references.

Assignment does not copy. It simply sets the reference to the old data. So you need copy to create a new list with the same contents.

Answered By: perreal

Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

The difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the
    original.

Here’s a little demonstration:

import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]

Using normal assignment operatings to copy:

d = c

print id(c) == id(d)          # True - d is the same object as c
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a shallow copy:

d = copy.copy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # True - d[0] is the same object as c[0]

Using a deep copy:

d = copy.deepcopy(c)

print id(c) == id(d)          # False - d is now a new object
print id(c[0]) == id(d[0])    # False - d[0] is now a new object
Answered By: grc

a, b, c, d, a1, b1, c1 and d1 are references to objects in memory, which are uniquely identified by their ids.

An assignment operation takes a reference to the object in memory and assigns that reference to a new name. c=[1,2,3,4] is an assignment that creates a new list object containing those four integers, and assigns the reference to that object to c. c1=c is an assignment that takes the same reference to the same object and assigns that to c1. Since the list is mutable, anything that happens to that list will be visible regardless of whether you access it through c or c1, because they both reference the same object.

c1=copy.copy(c) is a “shallow copy” that creates a new list and assigns the reference to the new list to c1. c still points to the original list. So, if you modify the list at c1, the list that c refers to will not change.

The concept of copying is irrelevant to immutable objects like integers and strings. Since you can’t modify those objects, there is never a need to have two copies of the same value in memory at different locations. So integers and strings, and some other objects to which the concept of copying does not apply, are simply reassigned. This is why your examples with a and b result in identical ids.

c1=copy.deepcopy(c) is a “deep copy”, but it functions the same as a shallow copy in this example. Deep copies differ from shallow copies in that shallow copies will make a new copy of the object itself, but any references inside that object will not themselves be copied. In your example, your list has only integers inside it (which are immutable), and as previously discussed there is no need to copy those. So the “deep” part of the deep copy does not apply. However, consider this more complex list:

e = [[1, 2],[4, 5, 6],[7, 8, 9]]

This is a list that contains other lists (you could also describe it as a two-dimensional array).

If you run a “shallow copy” on e, copying it to e1, you will find that the id of the list changes, but each copy of the list contains references to the same three lists — the lists with integers inside. That means that if you were to do e[0].append(3), then e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. But e1 would also be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. On the other hand, if you subsequently did e.append([10, 11, 12]), e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9],[10, 11, 12]]. But e1 would still be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. That’s because the outer lists are separate objects that initially each contain three references to three inner lists. If you modify the inner lists, you can see those changes no matter if you are viewing them through one copy or the other. But if you modify one of the outer lists as above, then e contains three references to the original three lists plus one more reference to a new list. And e1 still only contains the original three references.

A ‘deep copy’ would not only duplicate the outer list, but it would also go inside the lists and duplicate the inner lists, so that the two resulting objects do not contain any of the same references (as far as mutable objects are concerned). If the inner lists had further lists (or other objects such as dictionaries) inside of them, they too would be duplicated. That’s the ‘deep’ part of the ‘deep copy’.

Answered By: Andrew Gorcester

Let’s see in a graphical example how the following code is executed:

import copy

class Foo(object):
    def __init__(self):
        pass


a = [Foo(), Foo()]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

enter image description here

Answered By: user1767754

For immutable objects, creating a copy don’t make much sense since they are not going to change. For mutable objects assignment,copy and deepcopy behaves differently. Lets talk about each of them with examples.

An assignment operation simply assigns the reference of source to destination e.g:

>>> i = [1,2,3]
>>> j=i
>>> hex(id(i)), hex(id(j))
>>> ('0x10296f908', '0x10296f908') #Both addresses are identical

Now i and j technically refers to same list. Both i and j have same memory address. Any updation to either
of them will be reflected to the other. e.g:

>>> i.append(4)
>>> j
>>> [1,2,3,4] #Destination is updated

>>> j.append(5)
>>> i
>>> [1,2,3,4,5] #Source is updated

On the other hand copy and deepcopy creates a new copy of variable. So now changes to original variable will not be reflected
to the copy variable and vice versa. However copy(shallow copy), don’t creates a copy of nested objects, instead it just
copies the reference of nested objects. Deepcopy copies all the nested objects recursively.

Some examples to demonstrate behaviour of copy and deepcopy:

Flat list example using copy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.copy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using copy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.copy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x10296f908') #Nested lists have same address

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5,6]] #Updation of original nested list updated the copy as well

Flat list example using deepcopy:

>>> import copy
>>> i = [1,2,3]
>>> j = copy.deepcopy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different

>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable

Nested list example using deepcopy:

>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.deepcopy(i)

>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different

>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x102b9b7c8') #Nested lists have different addresses

>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5]] #Updation of original nested list didn't affected the copied variable    
Answered By: Sohaib Farooqi

Below code demonstrates the difference between assignment, shallow copy using the copy method, shallow copy using the (slice) [:] and the deepcopy. Below example uses nested lists there by making the differences more evident.

from copy import deepcopy

########"List assignment (does not create a copy) ############
l1 = [1,2,3, [4,5,6], [7,8,9]]
l1_assigned = l1

print(l1)
print(l1_assigned)

print(id(l1), id(l1_assigned))
print(id(l1[3]), id(l1_assigned[3]))
print(id(l1[3][0]), id(l1_assigned[3][0]))

l1[3][0] = 100
l1.pop(4)
l1.remove(1)


print(l1)
print(l1_assigned)
print("###################################")

########"List copy using copy method (shallow copy)############

l2 = [1,2,3, [4,5,6], [7,8,9]]
l2_copy = l2.copy()

print(l2)
print(l2_copy)

print(id(l2), id(l2_copy))
print(id(l2[3]), id(l2_copy[3]))
print(id(l2[3][0]), id(l2_copy[3][0]))
l2[3][0] = 100
l2.pop(4)
l2.remove(1)


print(l2)
print(l2_copy)

print("###################################")

########"List copy using slice (shallow copy)############

l3 = [1,2,3, [4,5,6], [7,8,9]]
l3_slice = l3[:]

print(l3)
print(l3_slice)

print(id(l3), id(l3_slice))
print(id(l3[3]), id(l3_slice[3]))
print(id(l3[3][0]), id(l3_slice[3][0]))

l3[3][0] = 100
l3.pop(4)
l3.remove(1)


print(l3)
print(l3_slice)

print("###################################")

########"List copy using deepcopy ############

l4 = [1,2,3, [4,5,6], [7,8,9]]
l4_deep = deepcopy(l4)

print(l4)
print(l4_deep)

print(id(l4), id(l4_deep))
print(id(l4[3]), id(l4_deep[3]))
print(id(l4[3][0]), id(l4_deep[3][0]))

l4[3][0] = 100
l4.pop(4)
l4.remove(1)

print(l4)
print(l4_deep)
print("##########################")
print(l4[2], id(l4[2]))
print(l4_deep[3], id(l4_deep[3]))

print(l4[2][0], id(l4[2][0]))
print(l4_deep[3][0], id(l4_deep[3][0]))
Answered By: Sandeep

The GIST to take is this:
Dealing with shallow lists (no sub_lists, just single elements) using “normal assignment” rises a “side effect” when you create a shallow list and then you create a copy of this list using “normal assignment”. This “side effect” is when you change any element of the copy list created, because it will automatically change the same elements of the original list. That is when copy comes in handy, as it won’t change the original list elements when changing the copy elements.

On the other hand, copy does have a “side effect” as well, when you have a list that has lists in it (sub_lists), and deepcopy solves it. For instance if you create a big list that has nested lists in it (sub_lists), and you create a copy of this big list (the original list). The “side effect” would arise when you modify the sub_lists of the copy list which would automatically modify the sub_lists of the big list. Sometimes (in some projects) you want to keep the big list (your original list) as it is without modification, and all you want is to make a copy of its elements (sub_lists). For that, your solution is to use deepcopy which will take care of this “side effect” and makes a copy without modifying the original content.

The different behaviors of copy and deep copy operations concerns only compound objects (ie: objects that contain other objects such as lists).

Here are the differences illustrated in this simple code example:

First

let’s check how copy (shallow) behaves, by creating an original list and a copy of this list:

import copy
original_list = [1, 2, 3, 4, 5, ['a', 'b']]
copy_list = copy.copy(original_list)

Now, let’s run some print tests and see how the original list behave compared to its copy list:

original_list and copy_list have different addresses

print(hex(id(original_list)), hex(id(copy_list))) # 0x1fb3030 0x1fb3328

elements of original_list and copy_list have the same addresses

print(hex(id(original_list[1])), hex(id(copy_list[1]))) # 0x537ed440 0x537ed440

sub_elements of original_list and copy_list have the same addresses

print(hex(id(original_list[5])), hex(id(copy_list[5]))) # 0x1faef08 0x1faef08

modifying original_list elements does NOT modify copy_list elements

original_list.append(6)
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b']]

modifying copy_list elements does NOT modify original_list elements

copy_list.append(7)
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b'], 7]

modifying original_list sub_elements automatically modify copy_list sub_elements

original_list[5].append('c')
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c'], 7]

modifying copy_list sub_elements automatically modify original_list sub_elements

copy_list[5].append('d')
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c', 'd'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c', 'd'], 7]

Second

let’s check how deepcopy behaves, by doing the same thing as we did with copy (creating an original list and a copy of this list):

import copy
original_list = [1, 2, 3, 4, 5, ['a', 'b']]
copy_list = copy.copy(original_list)

Now, let’s run some print tests and see how the original list behave compared to its copy list:

import copy
original_list = [1, 2, 3, 4, 5, ['a', 'b']]
copy_list = copy.deepcopy(original_list)

original_list and copy_list have different addresses

print(hex(id(original_list)), hex(id(copy_list))) # 0x1fb3030 0x1fb3328

elements of original_list and copy_list have the same addresses

print(hex(id(original_list[1])), hex(id(copy_list[1]))) # 0x537ed440 0x537ed440

sub_elements of original_list and copy_list have different addresses

print(hex(id(original_list[5])), hex(id(copy_list[5]))) # 0x24eef08 0x24f3300

modifying original_list elements does NOT modify copy_list elements

original_list.append(6)
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b']]

modifying copy_list elements does NOT modify original_list elements

copy_list.append(7)
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b'], 7]

modifying original_list sub_elements does NOT modify copy_list sub_elements

original_list[5].append('c')
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b'], 7]

modifying copy_list sub_elements does NOT modify original_list sub_elements

copy_list[5].append('d')
print("original_list is:", original_list) # original_list is: [1, 2, 3, 4, 5, ['a', 'b', 'c', 'd'], 6]
print("copy_list is:", copy_list) # copy_list is: [1, 2, 3, 4, 5, ['a', 'b', 'd'], 7]
Answered By: Fouad Boukredine

In python, when we assign objects like list, tuples, dict, etc to another object usually with a ‘ = ‘ sign, python creates copy’s by reference. That is, let’s say we have a list of list like this :

list1 = [ [ 'a' , 'b' , 'c' ] , [ 'd' , 'e' , 'f' ]  ]

and we assign another list to this list like :

list2 = list1

then if we print list2 in python terminal we’ll get this :

list2 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f ']  ]

Both list1 & list2 are pointing to same memory location, any change to any one them will result in changes visible in both objects, i.e both objects are pointing to same memory location.
If we change list1 like this :

list1[0][0] = 'x’
list1.append( [ 'g'] )

then both list1 and list2 will be :

list1 = [ [ 'x', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g'] ]
list2 = [ [ 'x', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g’ ] ]

Now coming to Shallow copy, when two objects are copied via shallow copy, the child object of both parent object refers to same memory location but any further new changes in any of the copied object will be independent to each other.
Let’s understand this with a small example. Suppose we have this small code snippet :

import copy

list1 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f ']  ]      # assigning a list
list2 = copy.copy(list1)       # shallow copy is done using copy function of copy module

list1.append ( [ 'g', 'h', 'i'] )   # appending another list to list1

print list1
list1 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g', 'h', 'i'] ]
list2 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f '] ]

notice, list2 remains unaffected, but if we make changes to child objects like :

list1[0][0] = 'x’

then both list1 and list2 will get change :

list1 = [ [ 'x', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g', 'h', 'i'] ] 
list2 = [ [ 'x', 'b', 'c'] , [ 'd', 'e', ' f '] ]

Now, Deep copy helps in creating completely isolated objects out of each other. If two objects are copied via Deep Copy then both parent & it’s child will be pointing to different memory location.
Example :

import copy

list1 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f ']  ]         # assigning a list
list2 = deepcopy.copy(list1)       # deep copy is done using deepcopy function of copy module

list1.append ( [ 'g', 'h', 'i'] )   # appending another list to list1

print list1
list1 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g', 'h', 'i'] ]
list2 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f '] ]

notice, list2 remains unaffected, but if we make changes to child objects like :

list1[0][0] = 'x’

then also list2 will be unaffected as all the child objects and parent object points to different memory location :

list1 = [ [ 'x', 'b', 'c'] , [ 'd', 'e', ' f '] , [ 'g', 'h', 'i'] ] 
list2 = [ [ 'a', 'b', 'c'] , [ 'd', 'e', ' f  ' ] ]

Hope it helps.

Answered By: Nitish Chauhan
>>lst=[1,2,3,4,5]

>>a=lst

>>b=lst[:]

>>> b
[1, 2, 3, 4, 5]

>>> a
[1, 2, 3, 4, 5]

>>> lst is b
False

>>> lst is a
True

>>> id(lst)
46263192

>>> id(a)
46263192 ------>  See here id of a and id of lst is same so its called deep copy and even boolean answer is true

>>> id(b)
46263512 ------>  See here id of b and id of lst is not same so its called shallow copy and even boolean answer is false although output looks same.
Answered By: sudhir tataraju

Not sure if it mentioned above or not, but it’s very importable to undestand that .copy() create reference to original object. If you change copied object – you change the original object.
.deepcopy() creates new object and does real copying of original object to new one. Changing new deepcopied object doesn’t affect original object.

And yes, .deepcopy() copies original object recursively, while .copy() create a reference object to first-level data of original object.

So the copying/referencing difference between .copy() and .deepcopy() is significant.

Answered By: remort

Deep copy is related to nested structures. If you have list of lists, then deepcopy copies the nested lists also, so it is a recursive copy. With just copy, you have a new outer list, but inner lists are references. Assignment does not copy.
For Ex

import copy
spam = [[0, 1, 2, 3], 4, 5]
cheese = copy.copy(spam)
cheese.append(3)
cheese[0].append(3)
print(spam)
print(cheese)

OutPut

[[0, 1, 2, 3, 3], 4, 5]
[[0, 1, 2, 3, 3], 4, 5, 3]
Copy method copy content of outer list to new list but inner list is still same for both list so if you make changes in inner list of any lists it will affects both list.

But if you use Deep copy then it will create new instance for inner list too.

import copy
spam = [[0, 1, 2, 3], 4, 5]
cheese = copy.deepcopy(spam)
cheese.append(3)
cheese[0].append(3)
print(spam)
print(cheese)

Output

[0, 1, 2, 3]
[[0, 1, 2, 3, 3], 4, 5, 3]

Answered By: Shubham Agarwal

The following code shows how underlying addresses are affected in copy, deepcopy and assignment. This is similar to what Sohaib Farooqi showed with lists, but with classes.

from copy import deepcopy, copy

class A(object):
    """docstring for A"""
    def __init__(self):
        super().__init__()

class B(object):
    """docstring for B"""
    def __init__(self):
        super().__init__()
        self.myA = A()

a = B()
print("a is", a)
print("a.myA is", a.myA)
print("After copy")
b = copy(a)
print("b is", b)
print("b.myA is", b.myA)
b.myA = A()
print("-- after changing value")
print("a is", a)
print("a.myA is", a.myA)
print("b is", b)
print("b.myA is", b.myA)

print("Resetting")
print("*"*40)
a = B()
print("a is", a)
print("a.myA is", a.myA)
print("After deepcopy")
b = deepcopy(a)
print("b is", b)
print("b.myA is", b.myA)
b.myA = A()
print("-- after changing value")
print("a is", a)
print("a.myA is", a.myA)
print("b is", b)
print("b.myA is", b.myA)

print("Resetting")
print("*"*40)
a = B()
print("a is", a)
print("a.myA is", a.myA)
print("After assignment")
b = a
print("b is", b)
print("b.myA is", b.myA)
b.myA = A()
print("-- after changing value")
print("a is", a)
print("a.myA is", a.myA)
print("b is", b)
print("b.myA is", b.myA)

The output from this code is as following:

a is <__main__.B object at 0x7f1d8ff59760>
a.myA is <__main__.A object at 0x7f1d8fe8f970>
After copy
b is <__main__.B object at 0x7f1d8fe43280>
b.myA is <__main__.A object at 0x7f1d8fe8f970>
-- after changing value
a is <__main__.B object at 0x7f1d8ff59760>
a.myA is <__main__.A object at 0x7f1d8fe8f970>
b is <__main__.B object at 0x7f1d8fe43280>
b.myA is <__main__.A object at 0x7f1d8fe85820>
Resetting
****************************************
a is <__main__.B object at 0x7f1d8fe85370>
a.myA is <__main__.A object at 0x7f1d8fe43310>
After deepcopy
b is <__main__.B object at 0x7f1d8fde3040>
b.myA is <__main__.A object at 0x7f1d8fde30d0>
-- after changing value
a is <__main__.B object at 0x7f1d8fe85370>
a.myA is <__main__.A object at 0x7f1d8fe43310>
b is <__main__.B object at 0x7f1d8fde3040>
b.myA is <__main__.A object at 0x7f1d8fe43280>
Resetting
****************************************
a is <__main__.B object at 0x7f1d8fe432b0>
a.myA is <__main__.A object at 0x7f1d8fe85820>
After assignment
b is <__main__.B object at 0x7f1d8fe432b0>
b.myA is <__main__.A object at 0x7f1d8fe85820>
-- after changing value
a is <__main__.B object at 0x7f1d8fe432b0>
a.myA is <__main__.A object at 0x7f1d8fe85370>
b is <__main__.B object at 0x7f1d8fe432b0>
b.myA is <__main__.A object at 0x7f1d8fe85370>
Answered By: Gautam Sreekumar