# Python memory usage of numpy arrays

## Question:

I’m using python to analyse some large files and I’m running into memory issues, so I’ve been using sys.getsizeof() to try to keep track of the usage, but its behaviour with numpy arrays is bizarre. Here’s an example involving a map of albedos that I’m having to open:

```
>>> import numpy as np
>>> import struct
>>> from sys import getsizeof
>>> f = open('Albedo_map.assoc', 'rb')
>>> getsizeof(f)
144
>>> albedo = struct.unpack('%df' % (7200*3600), f.read(7200*3600*4))
>>> getsizeof(albedo)
207360056
>>> albedo = np.array(albedo).reshape(3600,7200)
>>> getsizeof(albedo)
80
```

Well the data’s still there, but the size of the object, a 3600×7200 pixel map, has gone from ~200 MB to 80 bytes. I’d like to hope that my memory issues are over and just convert everything to numpy arrays, but I feel that this behaviour, if true, would in some way violate some law of information theory or thermodynamics, or something, so I’m inclined to believe that getsizeof() doesn’t work with numpy arrays. Any ideas?

## Answers:

You can use `array.nbytes` for numpy arrays, for example:

```
>>> import numpy as np
>>> from sys import getsizeof
>>> a = [0] * 1024
>>> b = np.array(a)
>>> getsizeof(a)
8264
>>> b.nbytes
8192
```
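The two numbers above are close because `getsizeof` on the list counts the list header plus one pointer per slot (but not the `int` objects the pointers reference), while `nbytes` counts the raw element data. A small sketch of that breakdown, assuming a 64-bit platform where the default integer dtype is 8 bytes wide:

```python
import numpy as np
from sys import getsizeof

a = [0] * 1024
b = np.array(a)

# nbytes is always elements * bytes-per-element:
print(b.itemsize * b.size == b.nbytes)   # True (8192 with an 8-byte dtype)

# The list's getsizeof covers the header and the pointer slots only,
# not the integer objects themselves:
print(getsizeof(a) - 8 * len(a))         # just the list header, a few dozen bytes
```

So the ~8264 vs. 8192 comparison understates the list's true footprint, since the referenced `int` objects live elsewhere.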

The field `nbytes` will give you the size in bytes of all the elements of a `numpy.array`:

```
size_in_bytes = my_numpy_array.nbytes
```

Notice that this does not measure the non-element attributes of the array object, so the actual size in bytes can be a few bytes larger than this.
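To illustrate: for an array that owns its buffer, `getsizeof()` typically reports `nbytes` plus a small per-object overhead (the exact overhead varies by platform and NumPy version):

```python
import numpy as np
from sys import getsizeof

a = np.zeros(1000, dtype=np.float64)
print(a.nbytes)                  # 8000: 1000 elements * 8 bytes each

# getsizeof counts the buffer plus the array object's own attributes:
overhead = getsizeof(a) - a.nbytes
print(overhead > 0)              # True; on the order of ~100 bytes
```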

In python notebooks I often want to filter out ‘dangling’ `numpy.ndarray`s, in particular the ones that are stored in `_1`, `_2`, etc. that were never really meant to stay alive.

I use this code to get a listing of all of them and their size.

Not sure if `locals()` or `globals()` is better here.

```
import numpy
from humanize import naturalsize

for size, name in sorted(
        (value.nbytes, name)
        for name, value in locals().items()
        if isinstance(value, numpy.ndarray)):
    print("{:>30}: {:>8}".format(name, naturalsize(size)))
```
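If pulling in the third-party `humanize` package is undesirable, a minimal dependency-free sketch of the same scan could look like this (`x` and `y` are stand-in arrays added so the loop finds something):

```python
import numpy as np

# Stand-in arrays for the scan to pick up:
x = np.zeros((100, 100))       # 80,000 bytes of float64
y = np.ones(10, dtype=np.int8) # 10 bytes

# list() snapshots the namespace so it can't change mid-iteration:
for size, name in sorted(
        (value.nbytes, name)
        for name, value in list(locals().items())
        if isinstance(value, np.ndarray)):
    print(f"{name:>10}: {size:>12,} bytes")
```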

To add more flesh to the accepted answer, here is a summary with a more transparent memory example (note that `int8` is one byte):

```
import numpy as np
from sys import getsizeof
a = np.ones(shape=(1000, 1), dtype='int8')
b = a.T
a.nbytes, getsizeof(a), b.nbytes, getsizeof(b), getsizeof(b.base)
```

Will produce the following output:

```
(1000, 1128, 1000, 128, 1128)
```

- `a.nbytes` = 1000: the size of the numerical elements (1000 one-byte elements).
- `getsizeof(a)` = 1128: the size of both the numerical elements and the reference machinery.
- `b.nbytes` = 1000: the size of the numerical elements independently of where they live in memory (it is not affected by the view status of `b`).
- `getsizeof(b)` = 128: only the size of the reference machinery; it is affected by the view status.
- `getsizeof(b.base)` = 1128: the size of the numerical elements plus the reference machinery, independently of the view status.

**Summarizing**: if you want to know the size of the numerical elements, use `array.nbytes`; it works regardless of whether the array is a view. If, on the other hand, you want the size of the numerical elements plus the whole reference machinery, use `getsizeof(array.base)` to get reliable estimates independent of view status.
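Putting the view check together, a small sketch (the helper names `data_bytes` and `total_bytes` are made up here, and only a single level of view is assumed, since `.base` of a chained view may itself be a view):

```python
import numpy as np
from sys import getsizeof

def data_bytes(arr):
    """Size of the numerical elements, regardless of view status."""
    return arr.nbytes

def total_bytes(arr):
    """getsizeof() of whichever array actually owns the buffer."""
    owner = arr if arr.base is None else arr.base
    return getsizeof(owner)

a = np.ones((1000, 1), dtype='int8')
b = a.T                    # a view sharing a's buffer
print(a.base is None)      # True: a owns its data
print(b.base is a)         # True: b is a view onto a
print(data_bytes(b))       # 1000, same as a.nbytes
```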