Python Basics: Shallow Copy and Deep Copy
Ziyu Zhou
Python basics: understand the differences between shallow copy and deep copy, and when and how to use each.
Table of Contents:
- Problem that You Might Not Expect
- How Assignments Work in Python
- What Will Happen If We Copy a List
- Shallow Copy & Deep Copy
- References
Problem that You Might Not Expect
Guess the output of this program:
>>> list_a = ['a', 'b']
>>> list_b = list_a
>>> list_b[0] = 'c'
>>> print(list_b)
>>> print(list_a)
You might expect the result to be:
>>> print(list_b)
['c', 'b']
>>> print(list_a)
['a', 'b']
However, the actual result is:
>>> print(list_b)
['c', 'b']
>>> print(list_a)
['c', 'b']
The value of list_a changes as list_b changes! This post explains why this happens and how to solve it.
How Assignments Work in Python
Before we talk about copies, let’s quickly go over how assignment works in Python, so that the following sections are easier to follow. Assigning and copying values such as integers or lists seems very simple in Python:
>>> x = 5
>>> y = x
>>> x
5
>>> y
5
However, what actually happens is not that simple. y is not a “new” variable with its own copy of the value: Python binds the name y to the same object that x refers to, which means that y and x are just two names for the same object, whose value is 5. The figure below shows how it works:
x ⟶ memory location where value 5 is stored
y ↗
We can confirm this by checking their identities using the id() function. Every object has an identity that is unique for as long as the object exists. Here we can see that x and y have the same id, which means they refer to the same object:
>>> id(x)
4297637024
>>> id(y)
4297637024
However, as soon as y is assigned a different value, Python binds it to a different object. Now x and y refer to different objects, and thus changing the value of y won’t affect x:
>>> y = 1
>>> y
1
>>> x
5
>>> id(y)
4297636896
>>> id(x)
4297637024
Note: Assignment statements in Python do not copy objects, they create bindings between a target and an object.
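The binding behavior described in this section can be checked with the "is" operator, which compares identities in the same way as comparing id() values. This is only a minimal sketch that mirrors the x and y example above; the names same_before and same_after are introduced here for illustration.

```python
x = 5
y = x                   # bind the name y to the same object as x
same_before = x is y    # True: one object, two names

y = 1                   # rebind y to a different object; x is untouched
same_after = x is y     # False: the names now refer to different objects

print(same_before, same_after, x)   # True False 5
```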
What Will Happen If We Copy a List
According to the section above, assigning a list to a second name and then assigning a different value to that name is supposed to work smoothly:
>>> list_a = ['a', 'b']
>>> list_b = list_a
>>> list_b = ['e', 'f']
>>> print(list_b)
['e', 'f']
>>> print(list_a)
['a', 'b']
But this is problematic:
>>> list_a = ['a', 'b']
>>> list_b = list_a
>>> list_b[0] = 'c'
>>> print(list_b)
['c', 'b']
>>> print(list_a)
['c', 'b']
This is because changing an element of a list, which is a compound object (an object that contains other objects, like a list that contains strings), modifies the list in place instead of binding the name to a new object. That is, after executing list_b[0] = 'c', list_b still refers to the same list as list_a.
This is the situation where we need an “actual” copy, so that we can change one copy without changing the other.
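To make the contrast between the two cases explicit, the sketch below puts in-place mutation and rebinding side by side. It reuses list_a and list_b from the examples above; the helper names aliased, after_mutation, and still_aliased are only illustrative.

```python
list_a = ['a', 'b']
list_b = list_a                     # no copy: both names refer to one list

list_b[0] = 'c'                     # mutate the shared list in place
aliased = list_b is list_a          # True: still the same object
after_mutation = list(list_a)       # snapshot of list_a: ['c', 'b']

list_b = ['e', 'f']                 # rebind list_b to a brand-new list
still_aliased = list_b is list_a    # False: list_a is no longer shared

print(aliased, after_mutation, still_aliased, list_a)
# True ['c', 'b'] False ['c', 'b']
```

Mutation goes through the shared object, so both names see it. Rebinding only changes what one name refers to.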
Shallow Copy & Deep Copy
Shallow copy and deep copy both produce such an “actual” copy.
Differences between Shallow Copy and Deep Copy
According to the Python documentation, the difference between shallow and deep copying is only relevant for compound objects.
- A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
- A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
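The two definitions above can be observed directly by comparing object identities. The following is a small sketch using the standard copy module; the variable names original, shallow, and deep are chosen here for illustration.

```python
import copy

original = ['a', 'b', ['ccc', 'ddd']]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Both copies are new outer lists that compare equal to the original.
print(shallow is original, deep is original)    # False False
print(shallow == original, deep == original)    # True True

# The shallow copy holds a reference to the same inner list;
# the deep copy holds its own copy of the inner list.
print(shallow[2] is original[2])                # True
print(deep[2] is original[2])                   # False
```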
How to Perform Shallow Copy
There are three ways to perform a shallow copy:
Using the copy Module
import copy
x = ['a', 'b']
y = copy.copy(x)  # Return a shallow copy of x.
Calling the Factory Functions
Python’s built-in mutable collections like lists, dicts, and sets can be copied by calling their factory functions on an existing collection.
new_list = list(original_list)
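The same pattern applies to the other built-in collections mentioned above. Here is a brief sketch, assuming small example collections whose names and contents are invented for illustration:

```python
original_list = ['a', 'b']
original_dict = {'k': 'v'}
original_set = {'a', 'b'}

# Each factory function builds a new collection from an existing one.
new_list = list(original_list)
new_dict = dict(original_dict)
new_set = set(original_set)

# Changing the copies at the top level leaves the originals untouched.
new_list.append('c')
new_dict['k2'] = 'v2'
new_set.add('c')

print(original_list, original_dict, original_set == {'a', 'b'})
# ['a', 'b'] {'k': 'v'} True
```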
Using List Slicing
>>> list1 = ['a','b','c','d']
>>> list2 = list1[:]
>>> list2[1] = 'x'
>>> print(list2)
['a', 'x', 'c', 'd']
>>> print(list1) # list1 is not affected by list2
['a', 'b', 'c', 'd']
Perfect. However, what happens if our list is a nested one? For example:
>>> list1 = ['a','b',['ccc','ddd']]
>>> list2 = list1[:]
>>> list2[2][0] = 'E'
>>> print(list2)
['a', 'b', ['E', 'ddd']]
>>> print(list1) # Oops!
['a', 'b', ['E', 'ddd']]
For nested lists, we need to copy the inner lists recursively. In other words, we have to perform a deep copy.
Note: a shallow copy is only one level deep. The copying process does not recurse and therefore won’t create copies of the child objects themselves.
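The "one level deep" rule is not specific to lists. As a rough illustration, the sketch below applies a shallow copy to a dict that contains a list; the keys and values are made up for this example.

```python
import copy

original = {'letters': ['a', 'b'], 'count': 2}
shallow = copy.copy(original)

shallow['count'] = 3              # replace a top-level value: only the copy changes
shallow['letters'].append('c')    # mutate a child object: both dicts see the change

print(original)   # {'letters': ['a', 'b', 'c'], 'count': 2}
print(shallow)    # {'letters': ['a', 'b', 'c'], 'count': 3}
```

Replacing a top-level entry only touches the new outer container, while mutating a child goes through the reference that both containers share.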
How to Perform Deep Copy
We’ll need to use the copy module:
# Return a deep copy of x.
import copy
copy.deepcopy(x)
Let’s try the above code using deep copy:
>>> import copy
>>> list1 = ['a','b',['ccc','ddd']]
>>> list2 = copy.deepcopy(list1)
>>> list2[2][0] = 'E'
>>> list2
['a', 'b', ['E', 'ddd']]
>>> list1
['a', 'b', ['ccc', 'ddd']]
Now we’re good to go 😄!