# What is the runtime complexity of Python's deepcopy()?

## Question:

I’m trying to improve the speed of an algorithm and, after looking at which operations are being called, I’m having difficulty pinning down exactly what’s slowing things up. I’m wondering if Python’s deepcopy() could possibly be the culprit or if I should look a little further into my own code.

The complexity of `deepcopy()` is dependant upon the size (number of elements/children) of the object being copied.

If your algorithm’s inputs do not affect the size of the object(s) being copied, then you should consider the call to `deepcopy()` to be `O(1)` for the purposes of determining complexity, since each invocation’s execution time is relatively static.

(If your algorithm’s inputs do have an effect on the size of the object(s) being copied, you’ll have to elaborate how. Then the complexity of the algorithm can be evaluated.)

What are you using `deepcopy` for? As the name suggests, `deepcopy` copies the object, and all subobjects recursively, so it is going to take an amount of time proportional to the size of the object you are copying. (with a bit of overhead to deal with circular references)

There isn’t really any way to speed it up, if you are going to copy everything, you need to copy everything.

One question to ask, is do you need to copy everything, or can you just copy part of the structure.

Looking at the code (you can too), it goes through every object in the tree of referenced objects (e.g. dict’s keys and values, object member variables, …) and does two things for them:

1. see if it’s already been copied, by looking it in id-indexed `memo` dict
2. copy of the object if not

The second one is O(1) for simple objects. For composite objects, the same routine handles them, so over all n objects in the tree, that’s O(n). The first part, looking an object up in a dict, is O(1) on average, but O(n) amortized worst case.

So at best, on average, `deepcopy` is linear. The keys used in `memo` are `id()` values, i.e. memory locations, so they are not randomly distributed over the key space (the “average” part above) and it may behave worse, up to the O(n^2) worst case. I did observe some performance degradations in real use, but for the most part, it behaved as linear.

That’s the complexity part, but the constant is large and `deepcopy` is anything but cheap and could very well be causing your problems. The only sure way to know is to use a profiler — do it. FWIW, I’m currently rewriting terribly slow code that spends 98% of its execution time in `deepcopy`.

If you see the source code for copy module, you can see that the function `deepcopy()` processing actually depends on the type of input provided.

For example, if the input type is a list, then the processing is done by the `list.copy` function. Which has a time complexity of `O(n)`.
Similarly, for set, its also `O(n)`.

Categories: questions
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.