list memory usage in ipython and jupyter

Question:

I have a few (almost ten) Gb of memory taken by the ipython kernel. I think this is coming from large objects (matrices, lists, numpy arrays, …) that I might have produced during some operation and now I do not need anymore.

I would like to list all of the objects I have defined and sort them by their memory footprint. Is there a simple way to do that? For certain types there is nbytes method, but not for all … so I am looking for a general way to list all objects I have made and their memory occupation.

Asked By: Rho Phi

||

Answers:

Assuming that you are using ipython or jupyter, you will need to do a little bit of work to get a list all of the objects you have defined. That means taking everything available in globals() and filtering out objects that are modules, builtins, ipython objects, etc. Once you are sure you have those objects, then you can proceed to grabbing their sizes with sys.getsizeof. This can be summed up as follows:

import sys

# These are the usual ipython objects, including this one you are creating
ipython_vars = ['In', 'Out', 'exit', 'quit', 'get_ipython', 'ipython_vars']

# Get a sorted list of the objects and their sizes
sorted([(x, sys.getsizeof(globals().get(x))) for x in dir() if not x.startswith('_') and x not in sys.modules and x not in ipython_vars], key=lambda x: x[1], reverse=True)

Please keep in mind that for python objects (those created with python’s builtin functions), sys.getsizeof will be very accurate. But it can be a bit inaccurate on objects created using third-party libraries. Furthermore, please be mindful that sys.getsizeof adds an additional garbage collector overhead if the object is managed by the garbage collector. So, some things may look a bit heavier than they actually are.

As a side note, numpy‘s .nbytes method can be somewhat misleading in that it does not include memory consumed by non-element attributes of the array object.

Answered By: Abdou

I like the answer @Abdou provided! I would only add the following suggestion. Instead of a list of tuples, I would convert it to a dictionary.

import sys

# These are the usual ipython objects, including this one you are creating
ipython_vars = ["In", "Out", "exit", "quit", "get_ipython", "ipython_vars"]

# Get a sorted list of the objects and their sizes
mem = {
    key: value
    for key, value in sorted(
        [
            (x, sys.getsizeof(globals().get(x)))
            for x in dir()
            if not x.startswith("_") and x not in sys.modules and x not in ipython_vars
        ],
        key=lambda x: x[1],
        reverse=True,
    )
}

Then if I wanted to get the total amount in MBs, all I’d have to do is:

sum(mem.values()) / 1e6
Answered By: Daniel Cárdenas