Python Kafka consumer message deserialisation using AVRO, without schema registry – problem
Question:
I have a problem deserializing Kafka messages. I am using confluent-kafka.
There is no schema registry – the schemas are hardcoded.
I can connect a consumer to any topic and receive messages, but I can't deserialize them.
The output after deserialization looks something like this:
print(reader) line:
<avro.io.DatumReader object at 0x000002354235DBB0>
I think my deserialization code is wrong, but how do I solve this problem?
In the end I want to extract the deserialized key and value.
from confluent_kafka import Consumer, KafkaException, KafkaError
import sys
import time
import avro.schema
from avro.io import DatumReader, DatumWriter


def kafka_conf():
    conf = {''' MY CONFIGURATION'''
    }
    return conf


if __name__ == '__main__':
    conf = kafka_conf()
    topic = """MY TOPIC"""
    c = Consumer(conf)
    c.subscribe([topic])
    try:
        while True:
            msg = c.poll(timeout=200.0)
            if msg is None:
                continue
            if msg.error():
                # Error or event
                if msg.error().code() == KafkaError._PARTITION_EOF:
                    # End of partition event
                    sys.stderr.write('%% %s [%d] reached end at offset %d\n' %
                                     (msg.topic(), msg.partition(), msg.offset()))
                else:
                    # Error
                    raise KafkaException(msg.error())
            else:
                print("key: ", msg.key())
                print("value: ", msg.value())
                print("offset: ", msg.offset())
                print("topic: ", msg.topic())
                print("timestamp: ", msg.timestamp())
                print("headers: ", msg.headers())
                print("partition: ", msg.partition())
                print("latency: ", msg.latency())
                schema = avro.schema.parse(open("MY_AVRO_SCHEMA.avsc", "rb").read())
                print(schema)
                reader = DatumReader(msg.value, reader_schema=schema)
                print(reader)
            time.sleep(5)  # only on test
    except KeyboardInterrupt:
        print('\nAborted by user\n')
    finally:
        c.close()
Answers:
You're printing the reader object rather than deserializing the data; the actual deserialization is done with reader.read(), and you need a BinaryDecoder as well.
The DeserializingConsumer in the Confluent library source code does exactly the same thing, except that it fetches the schema from the registry rather than from the local filesystem, so I suggest you follow what it does.
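A minimal sketch of that approach, assuming the messages were written with plain Avro binary encoding (no Confluent schema-registry framing) and that MY_AVRO_SCHEMA.avsc matches the writer's schema; decode_avro is just an illustrative helper name:

import io
import avro.schema
from avro.io import DatumReader, BinaryDecoder

# Parse the schema once, outside the poll loop
schema = avro.schema.parse(open("MY_AVRO_SCHEMA.avsc", "rb").read())
reader = DatumReader(schema)

def decode_avro(payload: bytes):
    """Decode one plain Avro-encoded payload (no schema-registry wire format)."""
    decoder = BinaryDecoder(io.BytesIO(payload))
    return reader.read(decoder)

# Inside the poll loop; note that msg.value() and msg.key() are method calls:
# value = decode_avro(msg.value())
# key = decode_avro(msg.key())  # only if the key is Avro-encoded too

The key point is that DatumReader takes the writer's schema, and read() takes a BinaryDecoder wrapping the raw message bytes.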