Python: Send a list of multidimensional NumPy arrays over a socket

Question:

I want to send a list of multi-dimensional NumPy arrays over a socket to my server and restore its original structure on the other side. The list of arrays (variable aggregated_ndarrays) looks like this:

[array([[[[-1.04182057e-01,  9.81570184e-02,  8.69736895e-02,
          -6.61955923e-02, -4.51700203e-02],
         [ 5.26290983e-02, -1.18473642e-01,  2.64136307e-02,
          -9.26332623e-02, -6.63961545e-02],
         [-8.80082026e-02,  7.90973455e-02, -1.13944486e-02,
          -1.51292123e-02,  7.65037686e-02],
         [-9.15177837e-02,  7.08795676e-04, -1.08281896e-03,
           8.65678713e-02,  6.68114647e-02],
         [-8.45356733e-02, -6.90313280e-02, -5.81113175e-02,
          -1.14920050e-01, -4.11906727e-02]],

                        ...           
    
         3.35839503e-02,  6.30911887e-02,  4.10411768e-02,
        -3.64055522e-02, -3.56383622e-02,  9.80690420e-02,
         8.15757737e-02, -1.00057133e-01,  1.16158882e-02,
        -9.82330441e-02,  9.00610462e-02, -1.01473713e-02,
        -2.64037345e-02,  1.37711661e-02,  6.63968623e-02]], dtype=float32), array([-0.02089943, -0.0020895 , -0.00506333,  0.03931976,  0.04795408,
       -0.01520141, -0.03287903,  0.0037387 ,  0.01339047, -0.0576841 ],
      dtype=float32)]

This is the client:

import socket

# deserialize the weights, so they can be processed more easily
# Convert `Parameters` to `List[np.ndarray]`
aggregated_ndarrays: List[np.ndarray] = parameters_to_ndarrays(aggregated_parameters)
# send the aggregated weights to the central server together with the number of training examples
print("Attempting server connection")
conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
conn.connect(("127.0.0.1", 8088))
conn.send(str((aggregated_ndarrays,n_examples_fit)).encode())

And this is the server:

sock.bind(("127.0.0.1", 8088))
sock.listen()
print("Created server socket and listening %s" % sock)
conn, addr = sock.accept()
print("Accepted client connection")
weights, n1 = conn.recv(999_999).decode().rsplit(" ", 1)

I previously tried to send the data over the socket with json.dumps, but I get the error TypeError: Object of type ndarray is not JSON serializable. When I instead send the data as an encoded string, the server receives just a plain decoded string instead of a list of multi-dimensional NumPy arrays.

I am using Python 3.10.

Asked By: kyro121

||

Answers:

You could use pickle, a standard-library module that serializes Python objects to bytes. You can use it like this:

import pickle

# dump an object
dumped_object = pickle.dumps(obj)
# load an object from the dumped data
obj = pickle.loads(dumped_object)
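
As a quick check before involving sockets, the round trip can be tried on data shaped like the question's. This is only a sketch: the two arrays and the count below are hypothetical stand-ins for aggregated_ndarrays and n_examples_fit, not the real weights.

```python
import pickle

import numpy as np

# Hypothetical stand-ins for the question's aggregated_ndarrays and n_examples_fit.
aggregated_ndarrays = [
    np.arange(24, dtype=np.float32).reshape(2, 3, 4),
    np.linspace(-1, 1, 10, dtype=np.float32),
]
n_examples_fit = 128

# Serialize the (list, int) tuple to bytes and restore it again.
payload = pickle.dumps((aggregated_ndarrays, n_examples_fit))
restored_arrays, restored_count = pickle.loads(payload)

# dtype, shape and values all survive the round trip.
for original, restored in zip(aggregated_ndarrays, restored_arrays):
    assert restored.dtype == original.dtype
    assert restored.shape == original.shape
    assert np.array_equal(original, restored)
```

The payload is an ordinary bytes object, so it can be handed to a socket as-is.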

Modification of your code:

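# Client
import pickle
import socket
from typing import List

import numpy as np

# deserialize the weights, so they can be processed more easily
# Convert `Parameters` to `List[np.ndarray]`
aggregated_ndarrays: List[np.ndarray] = parameters_to_ndarrays(aggregated_parameters)
# send the aggregated weights to the central server together with the number of training examples
print("Attempting server connection")
conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
conn.connect(("127.0.0.1", 8088))
# sendall() keeps sending until the whole payload has been written
conn.sendall(pickle.dumps((aggregated_ndarrays, n_examples_fit)))
# closing the socket tells the server that the message is complete
conn.close()

# Server
import pickle
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 8088))
sock.listen()
print("Created server socket and listening %s" % sock)
conn, addr = sock.accept()
print("Accepted client connection")
# a single recv() may return only part of the data, so read until the client closes
chunks = []
while chunk := conn.recv(65536):
    chunks.append(chunk)
weights, n1 = pickle.loads(b"".join(chunks))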
Answered By: Elias

NumPy arrays have a .tobytes() method that converts an array into a bytes object that can be transmitted. The function np.frombuffer() converts the bytes back into an array, but the result is one-dimensional and its dtype defaults to float64 unless one is specified. Other data must be sent to reconstruct the original data type and shape of the array.
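
A small sketch of why the dtype and shape have to travel with the bytes (the variable names here are illustrative only):

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
raw = original.tobytes()  # 6 elements * 4 bytes each = 24 bytes

# Without a dtype the 24 bytes are read as three float64 values: wrong data.
misread = np.frombuffer(raw)

# With the original dtype and shape the array is recovered exactly.
restored = np.frombuffer(raw, dtype=np.float32).reshape(2, 3)
assert np.array_equal(restored, original)

# An array built on a bytes object is read-only; copy it if it must be modified.
writable = restored.copy()
writable[0, 0] = 42.0
```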

TCP is not a message-based protocol, so you cannot simply send the bytes and expect to receive them as a complete message in one recv() call. You must design a byte stream that has the information needed to determine a complete message has been received, and buffer received data until a complete message can be extracted.
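
One common way to delimit messages, shown here only as a sketch, is to prefix each payload with its length and loop on recv() until that many bytes have arrived. The helper names send_message, recv_exactly and recv_message are made up for this illustration, and socket.socketpair() stands in for a real client/server connection.

```python
import socket
import struct

def send_message(sock, payload: bytes) -> None:
    # 4-byte big-endian length prefix, followed by the payload itself.
    sock.sendall(struct.pack('!I', len(payload)) + payload)

def recv_exactly(sock, count: int) -> bytes:
    # recv() may return fewer bytes than requested, so keep reading.
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError('socket closed before the message was complete')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def recv_message(sock) -> bytes:
    (length,) = struct.unpack('!I', recv_exactly(sock, 4))
    return recv_exactly(sock, length)

# Two connected sockets in one process, used only to demonstrate the framing.
a, b = socket.socketpair()
with a, b:
    send_message(a, b'first')
    send_message(a, b'second message')
    first = recv_message(b)
    second = recv_message(b)
```

Even though both messages are already sitting in the receive buffer, each recv_message() call returns exactly one of them.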

socket.makefile() is a method that will buffer data and has the file-like methods readline and read. The former reads newline-terminated data, and the latter reads a fixed number of bytes. Both may return less data if the socket is closed.
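
The behaviour of readline and read can be tried in isolation. The following sketch again uses socket.socketpair() as a stand-in for a real connection; the byte values are arbitrary.

```python
import socket

a, b = socket.socketpair()
with a, b, b.makefile('rb') as rfile:
    # One newline-terminated line, four raw bytes, then a short tail.
    a.sendall(b'header line\n' + b'\x00\x01\x02\x03' + b'tail')
    a.shutdown(socket.SHUT_WR)   # no more data will follow

    line = rfile.readline()      # reads up to and including the newline
    body = rfile.read(4)         # reads exactly four bytes
    rest = rfile.read(100)       # only b'tail' is left before end of stream
    eof = rfile.readline()       # empty bytes object signals end of stream
```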

Below is a simple protocol that uses a single newline-terminated line of JSON as a header with the metadata needed to reconstruct a numpy array and socket.makefile to read the header line and byte data and extract the numpy array:

server.py

import json
import numpy as np
import socket

with socket.socket() as s:
    s.bind(('localhost', 5000))
    s.listen()
    while True:
        client, addr = s.accept()
        print(f'{addr}: connected')
        with client, client.makefile('rb') as rfile:
            while True:
                header = rfile.readline()
                if not header: break
                metadata = json.loads(header)
                print(f'{addr}: {metadata}')
                serial_data = rfile.read(metadata['length'])
                data = np.frombuffer(serial_data, dtype=metadata['type']).reshape(metadata['shape'])
                print(data)
        print(f'{addr}: disconnected')

client.py

import json
import numpy as np
import socket

def transmit(sock, data):
    serial_data = data.tobytes()
    metadata = {'type': data.dtype.name,
                'shape': data.shape,
                'length': len(serial_data)}
    sock.sendall(json.dumps(metadata).encode() + b'\n')
    sock.sendall(serial_data)

with socket.socket() as s:
    s.connect(('localhost', 5000))
    data = np.array([[1,2,3],[4,5,6],[7,8,9]], dtype=np.float32)
    transmit(s, data)
    data = np.array([[[1,2],[3,4]],[[5,6],[7,8]]], dtype=np.int16)
    transmit(s, data)

Output:

('127.0.0.1', 3385): connected
('127.0.0.1', 3385): {'type': 'float32', 'shape': [3, 3], 'length': 36}
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
('127.0.0.1', 3385): {'type': 'int16', 'shape': [2, 2, 2], 'length': 16}
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
('127.0.0.1', 3385): disconnected
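
The example above transmits one array per message, while the question's payload is a list of arrays plus an example count. One possible extension, sketched here under the assumption that a single JSON header describing every array is acceptable, is shown below. The names send_arrays and recv_arrays and the 'n_examples' and 'arrays' header keys are invented for this sketch, and socket.socketpair() replaces the real connection.

```python
import json
import socket

import numpy as np

def send_arrays(sock, arrays, n_examples):
    # Serialize every array first so the header can list each byte length.
    blobs = [a.tobytes() for a in arrays]
    header = {'n_examples': n_examples,
              'arrays': [{'type': a.dtype.name, 'shape': a.shape, 'length': len(blob)}
                         for a, blob in zip(arrays, blobs)]}
    sock.sendall(json.dumps(header).encode() + b'\n')
    for blob in blobs:
        sock.sendall(blob)

def recv_arrays(rfile):
    # rfile is the buffered reader returned by socket.makefile('rb').
    header = json.loads(rfile.readline())
    arrays = []
    for meta in header['arrays']:
        raw = rfile.read(meta['length'])
        arrays.append(np.frombuffer(raw, dtype=meta['type']).reshape(meta['shape']))
    return arrays, header['n_examples']

# Hypothetical stand-ins for aggregated_ndarrays and n_examples_fit.
weights = [np.arange(24, dtype=np.float32).reshape(2, 3, 4),
           np.arange(10, dtype=np.float32)]

a, b = socket.socketpair()
with a, b, b.makefile('rb') as rfile:
    send_arrays(a, weights, 128)
    restored, n_examples = recv_arrays(rfile)
```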

pickle can be used for serialization as well. It has the advantage that metadata is built-in and it works nicely with the file-like stream created by socket.makefile. The disadvantage is that it isn’t secure: unpickling data can execute arbitrary code, so a malicious client can take advantage of that.

server.py

import pickle
import numpy as np
import socket

with socket.socket() as s:
    s.bind(('localhost', 5000))
    s.listen()
    while True:
        client, addr = s.accept()
        print(f'{addr}: connected')
        with client, client.makefile('rb') as rfile:
            while True:
                try:
                    data = pickle.load(rfile)
                except EOFError:  # Raised when the client has closed the socket and no data is left
                    break
                print(data)
        print(f'{addr}: disconnected')

client.py

import pickle
import numpy as np
import socket

def transmit(sock, data):
    serial_data = pickle.dumps(data)
    sock.sendall(serial_data)

with socket.socket() as s:
    s.connect(('localhost', 5000))
    data = np.array([[1,2,3],[4,5,6],[7,8,9]], dtype=np.float32)
    transmit(s, data)
    data = np.array([[[1,2],[3,4]],[[5,6],[7,8]]], dtype=np.int16)
    transmit(s, data)

Output:

('127.0.0.1', 3578): connected
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
('127.0.0.1', 3578): disconnected
Answered By: Mark Tolonen