Why should you manually run a garbage collection in Python?


I have occasionally come across code that manually calls gc.collect(), but it’s unclear why. For what reasons, if any, would it be advantageous to manually run a garbage collection, rather than let Python handle it automatically?

Asked By: Brendan Abel



In short, efficiency.

As long as a name or another object refers to an object, that object occupies memory. In some cases the memory is still held for a while after the last reference to it is dropped.

Python’s memory management (more precisely, CPython’s) is built on reference counting. Reference counting is a cheap way to tell when an object no longer needs memory: each object tracks how many references point to it, and it is freed as soon as that count drops to zero.

Unfortunately, reference counting cannot detect objects that refer to each other in a cycle, because their counts never reach zero. To handle those, CPython also runs a separate, cycle-detecting garbage collector.

The issue is that this collector does not run the moment a cycle becomes unreachable (as reference counting would for ordinary objects). Instead it runs periodically, triggered by how many objects have been allocated since the last run.

Manually running garbage collection tends to come up in server applications or large-scale calculations, where even small delays in freeing memory matter.
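
One pattern sometimes seen in such long-running jobs is to switch off automatic collection while a unit of work runs and to call gc.collect() at a point the program chooses. The following is only a minimal sketch under that assumption: process_batch and run are hypothetical names, the workload is a stand-in, and whether this helps at all depends on the program and should be measured.

```python
import gc


def process_batch(batch):
    """Hypothetical workload that creates short-lived cyclic garbage."""
    total = 0
    for value in batch:
        holder = {"value": value}
        holder["self"] = holder   # self-reference: only the cycle collector can free this
        total += value
    return total


def run(batches):
    """Process each batch, collecting cyclic garbage between batches."""
    results = []
    gc.disable()                  # avoid collector pauses in the middle of a batch
    try:
        for batch in batches:
            results.append(process_batch(batch))
            gc.collect()          # reclaim cyclic garbage at a point we choose
    finally:
        gc.enable()               # restore automatic collection
    return results


print(run([[1, 2, 3], [4, 5]]))   # prints [6, 9]
```

The try/finally makes sure automatic collection is switched back on even if a batch raises.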

Here is a link from Digi.com that sums it up perfectly if my explanation isn’t clear. I hope this helps, Brendan!

Answered By: mmghu

CPython (most likely the Python implementation you are using) does both automatic reference counting and garbage collection.

Reference counting causes most objects in memory to be freed immediately when they’re no longer needed (that is, when no reference or variable points to the object any more).
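
A small sketch of this behaviour, assuming CPython (other implementations may free objects later). Node is a hypothetical placeholder class, and the weak reference is used only to observe whether the object still exists.

```python
import gc
import weakref


class Node:
    """Hypothetical placeholder; a plain object() cannot be weakly referenced."""


gc.disable()  # switch off the cycle collector so only reference counting is at work
try:
    node = Node()
    probe = weakref.ref(node)             # a weak reference does not keep the object alive
    alive_before = probe() is not None
    del node                              # drop the only strong reference
    alive_after = probe() is not None     # already gone, without any gc.collect()
finally:
    gc.enable()

print(alive_before, alive_after)          # on CPython: True False
```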

Garbage collection only gets involved if you have cyclic references. On a regular basis, CPython collects "islands" of objects that reference each other but are no longer reachable from the rest of the program.
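
To illustrate such an island, here is a minimal sketch, again assuming CPython. Node and its partner attribute are hypothetical names, and automatic collection is disabled only so that it cannot run in the middle of the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


gc.disable()  # keep the automatic collector from running mid-demonstration
try:
    a, b = Node(), Node()
    a.partner, b.partner = b, a              # a -> b -> a: a reference cycle
    probe = weakref.ref(a)

    del a, b                                 # no outside references remain
    alive_after_del = probe() is not None    # the reference counts never reached zero

    gc.collect()                             # run the cycle collector now
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)  # on CPython: True False
```

Without the explicit gc.collect(), the two objects would stay in memory until the next automatic collection.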

You rarely need to get involved with Python memory management. The explicit call to gc.collect() makes sure that garbage collection (not reference counting!) is run immediately. You probably don’t need this, though, and neither do most code bases that call gc.collect().

Answered By: Jonas H.