NumPy save some arrays at once

Question:

I am working with arrays of different shapes and I want to save them all with numpy.save. Consider that I have

mat1 = numpy.arange(8).reshape(4, 2)
mat2 = numpy.arange(6).reshape(2, 3)
numpy.save('mat.npy', numpy.array([mat1, mat2]))

It works. But when the two matrices have the same size along one dimension, it fails.

mat1 = numpy.arange(8).reshape(2, 4)
mat2 = numpy.arange(10).reshape(2, 5)
numpy.save('mat.npy', numpy.array([mat1, mat2]))

It raises:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,4) into shape (2)

Note that the problem is caused by numpy.array([mat1, mat2]), not by numpy.save.

I know that such array is possible:

>>> numpy.array([[[1, 2]], [[1, 2], [3, 4]]])
array([[[1, 2]], [[1, 2], [3, 4]]], dtype=object)

So all I want is to save two arrays such as mat1 and mat2 at once.

Asked By: zardav


Answers:

If you’d like to save multiple arrays in the same format as np.save, use np.savez.

For example:

import numpy as np

arr1 = np.arange(8).reshape(2, 4)
arr2 = np.arange(10).reshape(2, 5)
np.savez('mat.npz', name1=arr1, name2=arr2)

data = np.load('mat.npz')
print(data['name1'])
print(data['name2'])
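
As a hedged sketch of the same round trip, the loaded container can also be used as a context manager so the underlying file is closed, and its `files` attribute lists the stored names. The temporary path and the variable names `names`, `loaded1`, and `loaded2` are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np

arr1 = np.arange(8).reshape(2, 4)
arr2 = np.arange(10).reshape(2, 5)

# Write into a temporary directory so the sketch leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), 'mat.npz')
np.savez(path, name1=arr1, name2=arr2)

# np.load returns a lazy container; the "with" block closes the file afterwards.
with np.load(path) as data:
    names = sorted(data.files)   # names the arrays were saved under
    loaded1 = data['name1']      # each array is read when it is accessed
    loaded2 = data['name2']

print(names)
print(loaded1.shape, loaded2.shape)
```

The arrays stay usable after the block exits because they were already read into memory inside it.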

If you have several arrays, you can expand the arguments:

import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
np.savez('mat.npz', *data)

container = np.load('mat.npz')
data = [container[key] for key in container]

Note that iterating over the container is not guaranteed to return the arrays in the order they were passed in. If you do need to preserve order, you might consider using pickle instead.
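
If you would rather stay with np.savez, one possible workaround is to rebuild the list by key instead of by iteration: arrays passed positionally are stored under the names arr_0, arr_1, and so on. The following is only a sketch under that assumption; the temporary path and the name `restored` are made up for illustration.

```python
import os
import tempfile

import numpy as np

arrays = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

# Temporary location, used here only to keep the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), 'mat.npz')
np.savez(path, *arrays)

with np.load(path) as container:
    # Index by the automatic key names rather than relying on iteration order.
    # This assumes every stored array was passed positionally.
    restored = [container['arr_%d' % i] for i in range(len(container.files))]

print([a.shape for a in restored])
```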

If you use pickle, be sure to specify a binary protocol; otherwise (on Python 2, where the default protocol is ASCII) you’ll write things using ASCII pickle, which is particularly inefficient for numpy arrays. With a binary protocol, ndarrays pickle to more or less the same format as np.save/np.savez. For example:

# Note: on Python 2.x, use "import cPickle as pickle" instead.
import pickle
import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

with open('mat.pkl', 'wb') as outfile:
    pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)

with open('mat.pkl', 'rb') as infile:
    result = pickle.load(infile)

In this case, result and data will have identical contents and the order of the input list of arrays will be preserved.
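
To check that claim rather than take it on faith, a small sketch like the one below compares the two lists element by element. It assumes Python 3 and writes to a temporary path; `same_length` and `all_equal` are names introduced here for illustration.

```python
import os
import pickle
import tempfile

import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

# Temporary location so the check does not leave files behind in the working directory.
path = os.path.join(tempfile.mkdtemp(), 'mat.pkl')

with open(path, 'wb') as outfile:
    pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)

with open(path, 'rb') as infile:
    result = pickle.load(infile)

# Compare position by position: same count, same shapes, same values.
same_length = len(result) == len(data)
all_equal = all(np.array_equal(a, b) for a, b in zip(data, result))
print(same_length, all_equal)
```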

Answered By: Joe Kington

Small addition: if you’d like to use numpy.savez() and preserve names associated with the saved arrays (instead of arr_0, arr_1, ...) you can pass a dictionary as **kwargs using the double-star operator.

import numpy as np

d = {}
d['a'] = np.random.randint(10, size=5)
d['b'] = np.random.randint(10, size=5)
print(d)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}

np.savez("test", **d)
container = np.load("test.npz")

e = {name: container[name] for name in container}
print(e)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}

Answered By: L_W