Deserializing Prometheus `remote_write` Protobuf output in Python

Question:

I’m experimenting (for the first time) with Prometheus. I’ve set up Prometheus to send messages to a local Flask server:

remote_write:
  - url: "http://localhost:5000/metric"

I’m able to read the incoming bytes, but I can’t convert them into any meaningful data.

I’m very new to Prometheus (and Protobuf!), so I’m not sure what the best approach is. I would rather not use a third-party package; I want to learn and understand the Protobuf de/serialization myself.

I tried copying the metrics.proto definitions from the Prometheus GitHub repository and compiling them with protoc. I then imported the generated metrics_pb2.py file and tried to parse the incoming message:

read_metric = metrics_pb2.Metric()
read_metric.ParseFromString(request.data)

I also tried using the remote.proto definitions (specifically WriteRequest) which also didn’t work:

read_metric = remote_pb2.WriteRequest()
read_metric.ParseFromString(request.data)

This results in:
google.protobuf.message.DecodeError: Error parsing message

So I suspect that I’m using the wrong Protobuf definitions?

I would really appreciate any help & advice on this!

To provide some more context for what I’m attempting to accomplish:

I’m trying to stream data from multiple Prometheus instances to a message queue so they can be passed to a machine learning model.
I’m using online training with an active learning model, and I want the data to be (near) real-time. That’s why I thought remote_write would be a better approach than continuously scraping each instance. If you have any other ideas on how I can build this system, feel free to share. I’ve only been playing around with it for a couple of days, so I’m open to any feedback!

ANSWER EDIT:

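I first had to decompress the data using snappy (thanks, larsks!):

import snappy      # python-snappy package
import remote_pb2  # generated by protoc from remote.proto

# Inside the Flask view: the request body is Snappy-compressed Protobuf.
compressed = request.data
decompressed = snappy.uncompress(compressed)

read_metric = remote_pb2.WriteRequest()
read_metric.ParseFromString(decompressed)

Asked By: picklepick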

||

Answers:

The remote.proto file is the correct Protobuf definition. You may find this document useful: it explicitly defines the remote write protocol, includes the "official" Protobuf specification, and mentions that:

The remote write request MUST be encoded using Google Protobuf 3, and MUST use the schema defined above. Note the Prometheus implementation uses gogoproto optimisations – for receivers written in languages other than Golang the gogoproto types MAY be substituted for line-level equivalents.
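Since the question mentions wanting to understand the Protobuf encoding itself, here is a rough sketch of reading a WriteRequest by hand using only the standard library. The function names (`read_varint`, `iter_fields`, `decode_write_request`) are made up for illustration, and the field numbers are my reading of the Prometheus schema (WriteRequest.timeseries = 1, TimeSeries.labels = 1 and samples = 2, Label.name = 1 and value = 2, Sample.value = 1 and timestamp = 2), so check them against the .proto files you compiled. It skips every other field and expects the raw Protobuf bytes, so the compression described below must already be undone.

```python
import struct


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, value) for each field in a message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit (e.g. double)
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (string, nested message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_no, wire_type, value


def decode_write_request(buf):
    """Return a list of {"labels": {...}, "samples": [(timestamp, value)]}."""
    series = []
    for field_no, _, ts_bytes in iter_fields(buf):
        if field_no != 1:  # assumed: 1 = timeseries; other fields are skipped
            continue
        labels, samples = {}, []
        for ts_field, _, payload in iter_fields(ts_bytes):
            if ts_field == 1:  # assumed: Label message
                parts = {n: v.decode("utf-8")
                         for n, wt, v in iter_fields(payload) if wt == 2}
                labels[parts.get(1, "")] = parts.get(2, "")
            elif ts_field == 2:  # assumed: Sample message
                value, timestamp = 0.0, 0
                for n, _, v in iter_fields(payload):
                    if n == 1:
                        value = struct.unpack("<d", v)[0]
                    elif n == 2:
                        # int64 varints are two's complement over 64 bits
                        timestamp = v - (1 << 64) if v >= (1 << 63) else v
                samples.append((timestamp, value))
        series.append({"labels": labels, "samples": samples})
    return series
```

For real use the protoc-generated classes are the safer choice; the point here is only to show that the message is a nested sequence of tagged fields.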

The document also notes that the body of remote write requests is compressed:

The remote write request in the body of the HTTP POST MUST be compressed with Google’s Snappy. The block format MUST be used – the framed format MUST NOT be used.

So before you can parse the request body, you’ll need a Python solution that can decompress Snappy data in the block format.
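To give an idea of what the block format involves, here is a minimal pure-Python sketch of a Snappy block decompressor, written from my understanding of the published format description. The name `snappy_block_decompress` is made up for this example, the code does little validation of malformed input, and it will be slow on large bodies. In practice a maintained library (for example `snappy.uncompress` from python-snappy) is the more sensible choice.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def snappy_block_decompress(data):
    """Decompress Snappy block-format bytes (not the framed format)."""
    # The stream starts with the uncompressed length as a varint.
    expected_len, pos = read_varint(data, 0)
    out = bytearray()
    while pos < len(data):
        tag = data[pos]
        pos += 1
        kind = tag & 0x03
        if kind == 0:
            # Literal: upper 6 bits hold (length - 1), or 60..63 meaning the
            # value (length - 1) follows in the next 1..4 little-endian bytes.
            length = tag >> 2
            if length >= 60:
                extra = length - 59
                length = int.from_bytes(data[pos:pos + extra], "little")
                pos += extra
            length += 1
            out += data[pos:pos + length]
            pos += length
            continue
        if kind == 1:
            # Copy with 1-byte offset: 3-bit length (4..11), 11-bit offset.
            length = ((tag >> 2) & 0x07) + 4
            offset = ((tag >> 5) << 8) | data[pos]
            pos += 1
        elif kind == 2:
            # Copy with 2-byte little-endian offset.
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + 2], "little")
            pos += 2
        else:
            # Copy with 4-byte little-endian offset.
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + 4], "little")
            pos += 4
        if offset == 0 or offset > len(out):
            raise ValueError("invalid copy offset")
        # Copy byte by byte because source and destination may overlap.
        start = len(out) - offset
        for i in range(length):
            out.append(out[start + i])
    if len(out) != expected_len:
        raise ValueError("decompressed length does not match header")
    return bytes(out)
```

The result of a function like this is what gets passed to `ParseFromString` on a `WriteRequest`.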

(I found a link to that google doc from this article that talks about the development of the remote write protocol.)

Answered By: larsks