NumPy save some arrays at once
Question:
I am working with arrays of different shapes and I want to save them all with numpy.save. For example, consider:
mat1 = numpy.arange(8).reshape(4, 2)
mat2 = numpy.arange(9).reshape(2, 3)
numpy.save('mat.npy', numpy.array([mat1, mat2]))
It works. But when the two matrices have one dimension of the same size, it does not work:
mat1 = numpy.arange(8).reshape(2, 4)
mat2 = numpy.arange(10).reshape(2, 5)
numpy.save('mat.npy', numpy.array([mat1, mat2]))
It raises:
Traceback (most recent call last):
File "<input>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,4) into shape (2)
Note that the problem is caused by numpy.array([mat1, mat2]), not by numpy.save.
I know that such an array is possible:
>>> numpy.array([[[1, 2]], [[1, 2], [3, 4]]])
array([[[1, 2]], [[1, 2], [3, 4]]], dtype=object)
So all I want is to save two arrays such as mat1 and mat2 at once.
Answers:
If you’d like to save multiple arrays in the same format as np.save, use np.savez.
For example:
import numpy as np
arr1 = np.arange(8).reshape(2, 4)
arr2 = np.arange(10).reshape(2, 5)
np.savez('mat.npz', name1=arr1, name2=arr2)
data = np.load('mat.npz')
print(data['name1'])
print(data['name2'])
If you have several arrays, you can expand the arguments:
import numpy as np
data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
np.savez('mat.npz', *data)
container = np.load('mat.npz')
data = [container[key] for key in container]
Note that the order is not guaranteed to be preserved when you iterate over the container. If you do need to preserve order, you might consider using pickle instead.
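Since np.savez stores positional arguments under the names arr_0, arr_1, and so on, one way to get the saved order back without switching to pickle is to index the container by those names. A minimal sketch, assuming an illustrative temporary file path and variable names that are not part of the original answer:

```python
import os
import tempfile

import numpy as np

arrays = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

# Illustrative path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'mat.npz')
np.savez(path, *arrays)

# Positional arrays are stored as arr_0, arr_1, ..., so rebuilding the
# names from an index restores the order in which they were passed.
with np.load(path) as container:
    restored = [container['arr_%d' % i] for i in range(len(container.files))]
```

This relies only on the automatic naming scheme, not on the iteration order of the container.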
If you use pickle, be sure to specify a binary protocol; otherwise you’ll write ASCII pickles (the default on Python 2.x), which is particularly inefficient for numpy arrays. With a binary protocol, ndarrays pickle to more or less the same format as np.save/np.savez. For example:
# cPickle exists only on Python 2.x; fall back to pickle on 3.x.
try:
    import cPickle as pickle
except ImportError:
    import pickle
import numpy as np
data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
# Write the whole list with the most efficient binary protocol available.
with open('mat.pkl', 'wb') as outfile:
    pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)
# Read it back; the list structure and order come back unchanged.
with open('mat.pkl', 'rb') as infile:
    result = pickle.load(infile)
In this case, result and data will have identical contents and the order of the input list of arrays will be preserved.
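The approach the question started from can also be made to work by building the object array explicitly instead of letting numpy.array infer it. This is only a sketch under some assumptions: the file path is illustrative, np.save falls back to pickle for object arrays, and recent NumPy versions require allow_pickle=True when loading such a file.

```python
import os
import tempfile

import numpy as np

mat1 = np.arange(8).reshape(2, 4)
mat2 = np.arange(10).reshape(2, 5)

# Allocate a 1-D object array first, then fill it element by element,
# so NumPy never tries to broadcast the two shapes together.
both = np.empty(2, dtype=object)
both[0] = mat1
both[1] = mat2

path = os.path.join(tempfile.mkdtemp(), 'mat.npy')  # illustrative path
np.save(path, both)

# Object arrays are stored via pickle, so loading must opt in explicitly.
loaded = np.load(path, allow_pickle=True)
```

Because the file contains pickled data, it should only be loaded from a trusted source; np.savez avoids that concern for plain numeric arrays.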
Small addition: if you’d like to use numpy.savez() and preserve the names associated with the saved arrays (instead of arr_0, arr_1, ...), you can pass a dictionary as **kwargs using the double-star operator.
import numpy as np
d = {}
d['a'] = np.random.randint(10, size=5)
d['b'] = np.random.randint(10, size=5)
print(d)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}
np.savez("test", **d)
container = np.load("test.npz")
e = {name: container[name] for name in container}
print(e)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}
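The dictionary round trip above could be wrapped in a pair of small helpers. The names save_named and load_named are hypothetical, as is the temporary file path; this is a sketch of one possible arrangement rather than an established API.

```python
import os
import tempfile

import numpy as np


def save_named(path, arrays):
    """Save a dict mapping names to arrays into a single .npz file."""
    np.savez(path, **arrays)


def load_named(path):
    """Load every array from an .npz file back into a plain dict."""
    # The context manager closes the underlying zip file when done.
    with np.load(path) as container:
        return {name: container[name] for name in container.files}


path = os.path.join(tempfile.mkdtemp(), 'test.npz')  # illustrative path
original = {'a': np.arange(5), 'b': np.arange(6).reshape(2, 3)}
save_named(path, original)
loaded = load_named(path)
```

The dictionary keys must be strings and must not collide with np.savez's own parameter names, such as file, because they are passed as keyword arguments.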