Faster way to pass a numpy array through a protobuf message

Question:

I have a 921000 x 3 numpy array (921k 3D points, one point per row) that I am trying to pack into a protobuf message, and I am running into performance issues. I have control over the protocol and can change it as needed. I am using Python 3.10 and numpy 1.26.1, and I am using protocol buffers because I’m using gRPC.

For the very first unoptimized attempt I was using the following message structure:

message Point {
    float x = 1;
    float y = 2;
    float z = 3;
}

message BodyData {
    int32 id = 1;
    repeated Point points = 2;
}

And packing the points one at a time (let data be the large numpy array):

# BodyData and Point are the classes generated by protoc from the .proto above
body = BodyData()
for row in data:
    body.points.append(Point(x=row[0], y=row[1], z=row[2]))

This takes approximately 1.6 seconds, which is way too slow.

For the next attempt I ditched the Point structure and decided to transmit the points as a flat array of X/Y/Z triplets:

message Points {
    repeated float xyz = 1;
}

message BodyData {
    int32 id = 1;
    Points points = 2;
}

I did some performance tests to determine the fastest way to append a 2D numpy array to a list, and got the following results:

# Time: 80.1ms
points = []
points.extend(data.flatten())

# Time: 96.8ms
points = []
points.extend(data.reshape((data.shape[0] * data.shape[1],)))

# Time: 76.5ms - FASTEST
points = []
points.extend(data.flatten().tolist())

From this I determined that .extend(data.flatten().tolist()) was the fastest.

However, when I applied this to the protobuf message, it slowed way down:

# Time: 436.0ms
body = BodyData()
body.points.xyz.extend(data.flatten().tolist())

So the fastest way I’ve found to pack the numpy array into any protobuf message is 436ms for 921000 points.

This falls far short of my performance target of ~12ms per copy. I’m not sure I can get close to that, but is there any way I can do this more quickly?

Asked By: Jason C


Answers:

If your goal is just to send something over gRPC to another program that you control, then you don’t actually have to convert everything into "native" protobuf messages. You can use a protobuf bytes field to carry another serialization format, such as the output of numpy’s tobytes(), or Arrow. This will be much faster.
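
As a minimal sketch of that approach, assuming a bytes field named points and a generated module named body_pb2 (both names are illustrative, not from the original answer), the message could become:

message BodyData {
    int32 id = 1;
    bytes points = 2;  // raw float32 buffer, N x 3, row-major
}

And on the Python side (both ends must agree on the dtype and byte order; float32 in native order is assumed here):

import numpy as np

# body_pb2 is the module protoc generates from the .proto above
# (the module name is an assumption for this sketch)
from body_pb2 import BodyData

data = np.random.rand(921000, 3).astype(np.float32)

# Sender: tobytes() copies the raw buffer in a single pass, with no
# per-element Python float objects, so it avoids the cost of extend().
body = BodyData()
body.id = 1
body.points = data.tobytes()
payload = body.SerializeToString()

# Receiver: frombuffer() wraps the bytes without copying (the result is
# read-only), and reshape restores the N x 3 layout.
received = BodyData()
received.ParseFromString(payload)
points = np.frombuffer(received.points, dtype=np.float32).reshape(-1, 3)

For 921000 points the buffer is about 11 MB (921000 x 3 x 4 bytes), and producing it is essentially one memory copy rather than the creation of roughly 2.7 million Python floats, so it should land far closer to the ~12ms target than the 436ms extend() approach.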

Answered By: hobbs