Convert byte array back to numpy array

Question:

You can convert a numpy array to bytes using the .tobytes() method.

How do I decode these bytes back into a numpy array?
I tried the following for an array i of shape (28, 28):

>>> k = i.tobytes()
>>> np.frombuffer(k) == i
False

I also tried it with dtype uint8.

Asked By: Gautham Santhosh


Answers:

A couple of issues with what you’re doing:

  1. frombuffer will always interpret the input as a 1-dimensional array. It’s the first line of the documentation. So you’d have to reshape to be (28, 28).

  2. The default dtype is float (float64). So if you didn’t serialize float64 values, you’ll have to specify the dtype manually (a priori no one can tell what a stream of bytes means: you have to say what they represent).

  3. If you want to make sure the arrays are equal, you have to use np.array_equal. Using == will do an elementwise operation, and return a numpy array of bools (this presumably isn’t what you want).

How do I decode these bytes back into a numpy array?

Example:

In [3]: i = np.arange(28*28).reshape(28, 28)

In [4]: k = i.tobytes()

In [5]: y = np.frombuffer(k, dtype=i.dtype)

In [6]: y.shape
Out[6]: (784,)

In [7]: np.array_equal(y.reshape(28, 28), i)
Out[7]: True
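
To make the three points concrete, here is a small sketch of what the attempts in the question actually produce. It assumes an int64 array (the dtype is set explicitly so the byte counts are the same on every platform); the variable names are only illustrative.

```python
import numpy as np

# Assumption: the original array holds int64 values (8 bytes per element).
i = np.arange(28 * 28, dtype=np.int64).reshape(28, 28)
k = i.tobytes()

# No dtype given: the bytes are reinterpreted as float64, and the result is 1-D.
wrong = np.frombuffer(k)
print(wrong.dtype, wrong.shape)  # float64 (784,)

# dtype=uint8 yields one element per byte, so 784 * 8 elements.
as_bytes = np.frombuffer(k, dtype=np.uint8)
print(as_bytes.shape)  # (6272,)

# Matching dtype plus a reshape recovers the original values.
right = np.frombuffer(k, dtype=i.dtype).reshape(i.shape)

# == compares element by element and returns a boolean array, not a single bool.
elementwise = right == i
print(elementwise.shape)  # (28, 28)
print(np.array_equal(right, i))  # True
```

Note that an array created by frombuffer from a bytes object is read-only; call .copy() on it if you need to modify the result.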

HTH.

Answered By: Matt Messersmith

While you could use tobytes(), it isn’t the ideal method, as it stores neither the shape nor the dtype of the numpy array.

In cases where you have to send it to another process where you have no information about the shape, you will have to send the shape information explicitly.
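
If you do want to stay with tobytes(), one way to send the shape explicitly is to prepend a small header to the raw bytes. The sketch below is only an illustration: pack_array and unpack_array are hypothetical helper names, the length-prefixed JSON header is an ad hoc layout rather than any standard format, and it assumes a simple (non-structured, non-object) dtype.

```python
import json
import struct

import numpy as np


def pack_array(x: np.ndarray) -> bytes:
    """Hypothetical helper: header length + JSON header + raw array bytes."""
    header = json.dumps({"dtype": x.dtype.str, "shape": x.shape}).encode("utf-8")
    # 4-byte little-endian length prefix so the receiver knows where the data starts.
    return struct.pack("<I", len(header)) + header + x.tobytes()


def unpack_array(payload: bytes) -> np.ndarray:
    """Hypothetical helper: inverse of pack_array."""
    (header_len,) = struct.unpack_from("<I", payload, 0)
    header = json.loads(payload[4:4 + header_len].decode("utf-8"))
    data = np.frombuffer(payload, dtype=np.dtype(header["dtype"]), offset=4 + header_len)
    return data.reshape(header["shape"])


x = np.arange(28 * 28, dtype=np.int32).reshape(28, 28)
y = unpack_array(pack_array(x))
print(y.shape, y.dtype)  # (28, 28) int32
```

The receiving process then needs no prior knowledge of the array, only of this header layout.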

A more elegant solution is to save the array to a BytesIO buffer using np.save and recover it using np.load. The saved data includes the shape and dtype, so you don’t need to store that information separately and can recover your numpy array directly from the byte value.

Example:

>>> import numpy as np
>>> from io import BytesIO

>>> x = np.arange(28*28).reshape(28, 28)
>>> x.shape
(28, 28)

# save into a BytesIO buffer
>>> np_bytes = BytesIO()
>>> np.save(np_bytes, x, allow_pickle=True)

# get bytes value
>>> np_bytes = np_bytes.getvalue()
>>> type(np_bytes)
<class 'bytes'>

# load from bytes into numpy array
>>> load_bytes = BytesIO(np_bytes)
>>> loaded_np = np.load(load_bytes, allow_pickle=True)

# shape is preserved
>>> loaded_np.shape
(28, 28)

# both arrays are equal without sending shape
>>> np.array_equal(x, loaded_np)
True

Answered By: Saket Kumar Singh

For your convenience, here is a pair of serialization/deserialization functions implementing Saket Kumar Singh’s answer.

from io import BytesIO
import numpy as np

def array_to_bytes(x: np.ndarray) -> bytes:
    np_bytes = BytesIO()
    np.save(np_bytes, x, allow_pickle=True)
    return np_bytes.getvalue()


def bytes_to_array(b: bytes) -> np.ndarray:
    np_bytes = BytesIO(b)
    return np.load(np_bytes, allow_pickle=True)

# ----------
# quick test

def test():
    x = np.random.uniform(0, 155, (2, 3)).astype(np.float16)
    b = array_to_bytes(x)
    x1 = bytes_to_array(b)
    assert np.all(x == x1)


if __name__ == '__main__':
    test()
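
A possible variation, sketched under the assumption that the array has a plain numeric dtype (not object): allow_pickle=True is not required in that case, and passing allow_pickle=False on both sides avoids unpickling data that arrives from another process. The function names with the _no_pickle suffix are hypothetical.

```python
from io import BytesIO

import numpy as np


def array_to_bytes_no_pickle(x: np.ndarray) -> bytes:
    # Hypothetical variant: np.save raises an error here if x is an object array.
    buf = BytesIO()
    np.save(buf, x, allow_pickle=False)
    return buf.getvalue()


def bytes_to_array_no_pickle(b: bytes) -> np.ndarray:
    # Hypothetical variant: refuses to load pickled object arrays.
    return np.load(BytesIO(b), allow_pickle=False)


x = np.arange(6, dtype=np.float16).reshape(2, 3)
x1 = bytes_to_array_no_pickle(array_to_bytes_no_pickle(x))
print(x1.shape, x1.dtype)  # (2, 3) float16
```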

Answered By: Sam De Meyer

If you just need the array as bytes and are not restricted to the tobytes method, you can use pickle.dumps and pickle.loads.

Here is an example:

import pickle
import numpy as np

A = np.random.randint(0, 10, [2, 2])
A_bytes = pickle.dumps(A, protocol=0)
A_restore = pickle.loads(A_bytes)

# check the byte type and the restored array
np.testing.assert_array_equal(A_restore, A)
assert isinstance(A_bytes, bytes)

Answered By: M.Zhu