Convert byte array back to numpy array
Question:
You can convert a numpy array to bytes using the .tobytes() method.
How do I decode it back from this bytes array to a numpy array?
I tried this for an array i of shape (28, 28):
>>> k = i.tobytes()
>>> np.frombuffer(k) == i
False
I also tried with uint8.
Answers:
A couple of issues with what you’re doing:

1. frombuffer will always interpret the input as a 1-dimensional array. It’s the first line of the documentation. So you’d have to reshape it to (28, 28).
2. The default dtype is float. So if you didn’t serialize floats, you’ll have to specify the dtype manually (a priori no one can tell what a stream of bytes means: you have to say what they represent).
3. If you want to make sure the arrays are equal, you have to use np.array_equal. Using == does an elementwise comparison and returns a numpy array of bools (this presumably isn’t what you want).
How do I decode it back from this bytes array to a numpy array?
Example:
In [3]: i = np.arange(28*28).reshape(28, 28)
In [4]: k = i.tobytes()
In [5]: y = np.frombuffer(k, dtype=i.dtype)
In [6]: y.shape
Out[6]: (784,)
In [7]: np.array_equal(y.reshape(28, 28), i)
Out[7]: True
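To see why the dtype argument matters, here is a small sketch. It assumes the array is created with an explicit dtype=np.int64 (not something the question states) so that the byte counts are the same on every platform. Reading the same bytes with a different dtype does not raise an error; it just gives a different number of elements or meaningless values:

```python
import numpy as np

# Assumption for illustration: fix the dtype so the sizes below do not depend on the platform.
i = np.arange(28 * 28, dtype=np.int64).reshape(28, 28)
k = i.tobytes()

# 784 elements * 8 bytes per int64 element
n_bytes = len(k)

# Wrong dtype: every single byte becomes its own element.
as_uint8 = np.frombuffer(k, dtype=np.uint8)

# Default dtype (float64): same element count, but the integer bit
# patterns are reinterpreted as floats, so the values do not match.
as_float = np.frombuffer(k)

# Matching dtype, then reshape to restore the original layout.
restored = np.frombuffer(k, dtype=np.int64).reshape(28, 28)

print(n_bytes, as_uint8.shape, as_float.shape)
print(np.array_equal(restored, i))

# frombuffer on a bytes object gives a read-only array; copy it if you need to modify it.
writable = restored.copy()
```

The read-only behaviour follows from bytes being immutable: the array shares memory with k rather than owning a copy.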
HTH.
While you could use tobytes(), it isn’t the ideal method as it doesn’t store the shape information of the numpy array.
In cases where you have to send it to another process where you have no information about the shape, you will have to send the shape information explicitly.
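If you do want to stay with tobytes(), sending the shape explicitly could look roughly like the sketch below. The helper names pack_array and unpack_array and the header layout (a 4-byte length followed by a small JSON header) are made up for illustration, not a standard format, and the sketch assumes plain numeric dtypes:

```python
import json
import struct

import numpy as np


def pack_array(x: np.ndarray) -> bytes:
    # Hypothetical helper: describe dtype and shape in a small JSON header.
    header = json.dumps({"dtype": x.dtype.str, "shape": x.shape}).encode("utf-8")
    # Layout: 4-byte little-endian header length, header, raw data in C order.
    return struct.pack("<I", len(header)) + header + x.tobytes()


def unpack_array(b: bytes) -> np.ndarray:
    # Read the header length, then the header itself.
    (header_len,) = struct.unpack_from("<I", b, 0)
    header = json.loads(b[4:4 + header_len].decode("utf-8"))
    # Everything after the header is the raw array data.
    data = np.frombuffer(b, dtype=np.dtype(header["dtype"]), offset=4 + header_len)
    return data.reshape(header["shape"])


x = np.arange(28 * 28, dtype=np.int64).reshape(28, 28)
y = unpack_array(pack_array(x))
print(y.shape, y.dtype, np.array_equal(x, y))
```

The receiving process then needs no prior knowledge of the shape or dtype, at the cost of maintaining your own format.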
A more elegant solution would be saving it to a BytesIO buffer using np.save and recovering it using np.load. This way, you don’t need to store the shape information separately and can easily recover your numpy array from the bytes value.
Example:
>>> import numpy as np
>>> from io import BytesIO
>>> x = np.arange(28*28).reshape(28, 28)
>>> x.shape
(28, 28)
# save into a BytesIO buffer
>>> np_bytes = BytesIO()
>>> np.save(np_bytes, x, allow_pickle=True)
# get bytes value
>>> np_bytes = np_bytes.getvalue()
>>> type(np_bytes)
<class 'bytes'>
# load from bytes into numpy array
>>> load_bytes = BytesIO(np_bytes)
>>> loaded_np = np.load(load_bytes, allow_pickle=True)
# shape is preserved
>>> loaded_np.shape
(28, 28)
# both arrays are equal without sending shape
>>> np.array_equal(x,loaded_np)
True
For your convenience, here is a serialization/deserialization function implementing Saket Kumar’s answer.
from io import BytesIO

import numpy as np


def array_to_bytes(x: np.ndarray) -> bytes:
    # np.save writes the .npy format, which records dtype and shape in its header
    np_bytes = BytesIO()
    np.save(np_bytes, x, allow_pickle=True)
    return np_bytes.getvalue()


def bytes_to_array(b: bytes) -> np.ndarray:
    # wrap the bytes in a file-like object so np.load can read them
    np_bytes = BytesIO(b)
    return np.load(np_bytes, allow_pickle=True)


# quick test
def test():
    x = np.random.uniform(0, 155, (2, 3)).astype(np.float16)
    b = array_to_bytes(x)
    x1 = bytes_to_array(b)
    assert np.all(x == x1)


if __name__ == '__main__':
    test()
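A side note on allow_pickle: as far as I know, pickling only comes into play for object arrays, so for plain numeric arrays the same round trip should work with allow_pickle=False, which avoids unpickling data from a source you don’t trust. A sketch with hypothetical function names (the _no_pickle suffix is just for illustration):

```python
from io import BytesIO

import numpy as np


def array_to_bytes_no_pickle(x: np.ndarray) -> bytes:
    # Same .npy serialization as above, but pickling is disabled.
    buf = BytesIO()
    np.save(buf, x, allow_pickle=False)
    return buf.getvalue()


def bytes_to_array_no_pickle(b: bytes) -> np.ndarray:
    # np.load reads dtype and shape from the .npy header.
    return np.load(BytesIO(b), allow_pickle=False)


x = np.random.uniform(0, 155, (2, 3)).astype(np.float16)
x1 = bytes_to_array_no_pickle(array_to_bytes_no_pickle(x))
print(x1.shape, x1.dtype, np.array_equal(x, x1))
```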
If you just need the array as bytes and are not restricted to the tobytes method, you can use pickle.dumps and pickle.loads.
Here is an example:
import pickle

import numpy as np

A = np.random.randint(0, 10, [2, 2])
A_bytes = pickle.dumps(A, protocol=0)
A_restore = pickle.loads(A_bytes)
# test byte type and restored np mat
np.testing.assert_array_equal(A_restore, A)
assert isinstance(A_bytes, bytes)