Python: memory usage statistics per object type (or source code line)

Question:

I am doing some heavy calculations with Python (using OpenCV and NumPy), and at the end I am left with a lot of memory usage (>1GB), even though all references should be gone and I only hold the end result (which should not be more than a few MB).

To debug this, it would be nice to get some statistics showing how many object instances there are of each type, ordered by the total amount of memory they take (per object class).

Or even nicer: not per object class but per source code line where the object was created (although I guess this information is not available unless I activate some debugging in Python, which would make the calculation too slow, so I am not sure whether that would be helpful).

Can I get some stats like this somehow? Or how would I debug this?


Some have misunderstood me: I only need to know how to debug the memory usage. Processing time/run time is perfect.

Asked By: Albert

Answers:

I think you’re looking for a Python profiler.

There are several you can use, such as Heapy and Pysize for memory, or profile and cProfile for run time.

Example using Heapy:

Include this snippet somewhere in your code:

from guppy import hpy  # Heapy is part of the guppy package
h = hpy()
print(h.heap())  # summary of reachable objects, grouped by type

It will give you output like this:

Partition of a set of 132527 objects. Total size = 8301532 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  35144  27  2140412  26   2140412  26 str
     1  38397  29  1309020  16   3449432  42 tuple
     2    530   0   739856   9   4189288  50 dict (no owner)
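
If guppy is not installed, a rough approximation of the same kind of table can be put together with the standard library alone. This is only a sketch, not what Heapy does internally: the function name type_stats is made up for the example, gc.get_objects() only returns objects tracked by the garbage collector (so plain strings and numbers are missing), and sys.getsizeof() reports shallow sizes only.

```python
import gc
import sys
from collections import defaultdict


def type_stats(limit=10):
    """Return (type name, instance count, total shallow size) rows, largest first."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for obj in gc.get_objects():          # only gc-tracked (container) objects
        name = type(obj).__name__
        counts[name] += 1
        sizes[name] += sys.getsizeof(obj, 0)  # 0 if the size is unavailable
    rows = [(name, counts[name], sizes[name]) for name in counts]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


for name, count, size in type_stats():
    print("%-25s %8d %12d" % (name, count, size))
```

Calling it before and after the heavy calculation and comparing the rows shows which types keep growing.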

Example with cProfile (note that it measures run time, not memory):

You can run it like this:

python -m cProfile myscript.py

Output:

         5 function calls in 0.000 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 myscript.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {execfile}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

You can also use the gc module to find out why Python is not freeing your memory, and call gc.collect() to ask it to free memory.
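
A minimal sketch of what gc.collect() does, assuming the memory is held by a reference cycle. The Node class and make_cycle function are hypothetical and exist only to build such a cycle. The automatic collector is switched off for the demonstration so the effect of the explicit call is visible.

```python
import gc
import weakref


class Node(object):
    """Hypothetical container used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a   # a and b keep each other alive
    return weakref.ref(a)     # a weak reference does not keep `a` alive


gc.disable()                  # keep the automatic collector out of the demo
probe = make_cycle()
alive_before = probe() is not None
found = gc.collect()          # returns the number of unreachable objects found
alive_after = probe() is not None
gc.enable()

print("alive before collect:", alive_before)
print("unreachable objects found:", found)
print("alive after collect:", alive_after)
```

If gc.collect() finds nothing and memory stays high, the objects are either still referenced from somewhere or the memory is not managed by Python at all.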

By the way, have you looked at NumPy? I think it is more suitable if you’re doing heavy calculations like you said.

Answered By: mouad

Ok, I hunted it down. As none of the Python memory profilers gave any helpful output (because they couldn’t find the memory), I was quite sure that one of the external libs (OpenCV) was the source of the memory leak.
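
For readers on Python 3.4 or later, the standard tracemalloc module gives the per-source-line statistics asked about in the question. It is not one of the tools I used. The sketch below uses a made-up workload (the buffers list). tracemalloc only sees memory allocated through Python's own allocators, so if its total is far below what the process actually uses, that points to a native library.

```python
import tracemalloc

tracemalloc.start()

# Stand-in workload (hypothetical): byte buffers allocated by Python itself.
buffers = [bytearray(1024) for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")    # group allocations by source line

for stat in top[:5]:
    print(stat)                        # file:line, total size, block count

traced_now, traced_peak = tracemalloc.get_traced_memory()
print("traced by Python allocators: %d bytes" % traced_now)
tracemalloc.stop()
```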

And I could reproduce the memory leak with this simple code:

import cv  # legacy OpenCV Python bindings
while True:
    # the result is never stored, yet memory keeps growing
    cv.CreateHist([40], cv.CV_HIST_ARRAY, [[0,255]], 1)
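
To confirm a leak like this without letting the loop run forever, one option is to call the suspect function a fixed number of times and watch the process memory from outside Python's object model. The following is a sketch under assumptions: it relies on the Unix-only resource module, the names peak_rss, watch, and suspect are invented for the example, and suspect leaks on purpose by appending to a list as a stand-in for the OpenCV call.

```python
import resource  # Unix only


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def watch(func, rounds=5, calls_per_round=1000):
    """Call func repeatedly and record the peak RSS after each round."""
    samples = [peak_rss()]
    for _ in range(rounds):
        for _ in range(calls_per_round):
            func()
        samples.append(peak_rss())
    return samples


leaked = []                      # hypothetical leak: a list that only grows


def suspect():
    leaked.append(bytearray(b"x" * 10000))


samples = watch(suspect)
print(samples)
```

A peak that keeps climbing from round to round, while the Python-level profilers report nothing unusual, is a strong hint that the memory is being lost in native code.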

Some other resources for Python memory debugging which were quite interesting (they didn’t help in this case but may be useful for others):

Answered By: Albert