Why disable the garbage collector?

Question:

Python's gc.disable() disables automatic garbage collection. As I understand it, that could have significant side effects. Why would anyone want to disable automatic garbage collection, and how could one effectively manage memory without it?

Asked By: gerrit

Answers:

One use for disabling the garbage collector is to get more consistent results when timing the performance of code. The timeit module does this, as this excerpt from its source shows:

def timeit(self, number=default_number):
    if itertools:
        it = itertools.repeat(None, number)
    else:
        it = [None] * number
    gcold = gc.isenabled()
    gc.disable()
    ...
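
As a rough illustration of what this means for callers: timeit switches the collector off while it times, and the timeit documentation suggests re-enabling it in the setup string when GC is part of what you want to measure. The statement being timed and the repeat count below are arbitrary choices for this sketch, not anything taken from the answer.

```python
import timeit

# Arbitrary workload for the sketch: build many small container objects.
stmt = "[[] for _ in range(1000)]"

# Default behaviour: timeit turns the collector off for the duration of the timing.
gc_off = timeit.Timer(stmt).timeit(number=200)

# Re-enable the collector in the setup code so that GC cost is included.
gc_on = timeit.Timer(stmt, setup="import gc; gc.enable()").timeit(number=200)

print(f"GC disabled: {gc_off:.4f}s, GC enabled: {gc_on:.4f}s")
```

The two figures will differ from run to run and from machine to machine, so treat the comparison as indicative only.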

In Python 2, and in Python 3 up to 3.2, gc.disable() is also used to avoid a bug caused by garbage collection occurring between fork and exec. The problem seems to have been fixed in Python 3.3 without needing to call gc.disable().

Answered By: unutbu

Another use case would be to control garbage collection manually with gc.collect().

Answered By: James Mills

From the same page you link to:

Since the collector supplements the reference counting already used in
Python, you can disable the collector if you are sure your program
does not create reference cycles.

So that answers the second part of the question, “how could one effectively manage memory without it”. Don’t create reference cycles. It’s a fairly limited use case, sure.
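
A small sketch of the difference, assuming CPython (where reference counting frees objects immediately). The Node class is hypothetical and exists only for this demonstration; weak references are used to observe whether an object is still alive.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this demonstration."""

    def __init__(self):
        self.other = None


gc_was_enabled = gc.isenabled()
gc.disable()
try:
    # No cycle: the reference count drops to zero and the object is freed at once,
    # even though the cyclic collector is disabled.
    plain = Node()
    plain_ref = weakref.ref(plain)
    del plain
    freed_without_cycle = plain_ref() is None

    # Cycle: a and b refer to each other, so neither count reaches zero.
    a, b = Node(), Node()
    a.other, b.other = b, a
    cycle_ref = weakref.ref(a)
    del a, b
    cycle_alive_before_collect = cycle_ref() is not None

    # Only an explicit (or automatic) collection reclaims the cycle.
    gc.collect()
    cycle_freed_after_collect = cycle_ref() is None
finally:
    # Restore whatever state the collector was in before.
    if gc_was_enabled:
        gc.enable()

print(freed_without_cycle, cycle_alive_before_collect, cycle_freed_after_collect)
```

On CPython all three flags should come out true: the acyclic object disappears immediately, while the cycle lingers until gc.collect() runs.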

For the first part of the question, the answer is performance. Again, a fairly limited use case.

Disabling GC would only help if (a) the GC is actually doing work, and (b) that work is achieving nothing, that is to say it’s finding nothing to free, or finding so little that you think your program can tolerate the leak for as long as GC is disabled. So, if your program is too slow and doesn’t create reference cycles and disabling GC appears to speed it up, then you would consider disabling GC.
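
One way to check conditions (a) and (b) before disabling anything is to compare gc.get_stats() before and after the code in question. This is only a sketch: gc_work_during and workload are hypothetical names, and the workload is an arbitrary stand-in for your own code.

```python
import gc


def gc_work_during(func):
    """Hypothetical helper: count collections run and objects freed while func runs."""
    before = gc.get_stats()
    func()
    after = gc.get_stats()
    # gc.get_stats() returns one dict per generation; sum the deltas across them.
    runs = sum(a["collections"] - b["collections"] for a, b in zip(after, before))
    freed = sum(a["collected"] - b["collected"] for a, b in zip(after, before))
    return runs, freed


def workload():
    # Allocates many container objects but creates no reference cycles.
    return [[i] for i in range(100_000)]


runs, freed = gc_work_during(workload)
print(f"collections run: {runs}, objects freed by the collector: {freed}")
```

Many collections with almost nothing freed would suggest the collector is doing work that achieves nothing for this code; the exact counts depend on the Python version and the collector's thresholds.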

I speculate (based on other garbage collectors that I’ve seen, not Python’s in particular) that if you don’t allocate any memory then the garbage collector won’t have any long-term performance cost. It might have some short-term and unpredictable cost tidying up what has gone before. So even in the case where you’re going into a massive numpy number-crunching routine and want to squeeze all possible performance out of that part of the code, disabling GC while you do it still wouldn’t help. It would just delay the time cost of tidying up previous reference cycles until after you re-enable GC.

Arguably, programs that run for a short time and don’t use much memory do not need garbage collection; they can tolerate leaks. But even more arguably, if you start out thinking like that you will eventually get into trouble with a program that leaks more memory than you expected.

Answered By: Steve Jessop

The problem with leaving the GC enabled is that you never know when a collection will happen. So if (part of) your program is time-critical or has real-time requirements, you can disable the GC for as long as that part runs.

Whether you later switch the automatic GC back on or prefer to collect manually by calling gc.collect() does not matter for this question.
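
Both options can be wrapped in a small context manager. This is a minimal sketch, not a standard-library facility: gc_paused and its collect_on_exit parameter are names made up for illustration.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused(collect_on_exit=False):
    """Hypothetical context manager: suspend automatic collection for a block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if collect_on_exit:
            # Pay the collection cost now, at a moment we chose ourselves.
            gc.collect()
        if was_enabled:
            # Only switch automatic collection back on if it was on before.
            gc.enable()


with gc_paused(collect_on_exit=True):
    # The time-critical work would go here.
    state_inside = gc.isenabled()

print("automatic GC active inside the block:", state_inside)
```

Note that this only pauses the cyclic collector; in CPython, objects whose reference count reaches zero are still freed immediately inside the block.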

Also, some programs are designed to run for only a very short time, so the developer may be able to guarantee that no memory problem can occur in that time (consider programs like ls); in that case the whole GC aspect can be neglected in favor of performance.

Answered By: Alfe