python – 'utf-8' codec can't decode bytes in position 0-2: invalid continuation byte
Question:
I'm trying to decode the payload below from an AWS Kinesis data stream in an AWS Lambda function, but I keep getting the error "'utf-8' codec can't decode bytes in position 0-2: invalid continuation byte".
x = b'\xf3\x89\x9a\xc2\n$dad568a5-6305-481c-b6f1-f8338cc127df\n$3d57f33a-d681-467b-bb82-89c0d77e2621\n$3ade7757-3df4-41ec-bdc8-52a27449c420\n$a0a59a4e-02f5-462d-8c3e-50030145cf17\x1a\x83\x01\x08\x00\x1a\x7f{ "window_start": "2022-12-30 13:25:00","window_end": "2022-12-30 13:35:00","player_id": 2004,"bonus_stake": 2.76,"bonus_win": 4}\x1a\x86\x01\x08\x01\x1a\x81\x01{"window_start": "2022-12-30 13:25:00","window_end": "2022-12-30 13:35:00","player_id": 2304,"bonus_stake": 2.2,"bonus_win": 2.21}\x1a\x87\x01\x08\x02\x1a\x82\x01{"window_start": "2022-12-30 13:25:00","window_end": "2022-12-30 13:35:00","player_id": 2290,"bonus_stake": 11.1,"bonus_win": 38.7}\x1a\x86\x01\x08\x03\x1a\x81\x01{"window_start": "2022-12-30 13:25:00","window_end": "2022-12-30 13:35:00","player_id": 2192,"bonus_stake": 1.32,"bonus_win": 0.6}\x10\xa6\x1atB\xa5\x9b\x14\xa5?\xad\xcd\x8b\xe8^\xcb'
s = x.decode()
print(s)
Answers:
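Decoding the raw data from the output stream fails because record aggregation is enabled by default when writing to a Kinesis stream. The payload is therefore a binary aggregated record that packs several records together (it begins with a 4-byte magic prefix, not text), so it is not valid UTF-8 as a whole. To write plain, non-aggregated records, set the following option on your table: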
'sink.producer.aggregation-enabled' = 'false'
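If changing the producer is not an option, the Lambda function can unpack the aggregated record itself. The following is only a minimal sketch, assuming the payload follows the Kinesis Producer Library aggregation layout as I understand it: a 4-byte magic prefix, a protobuf message whose field 3 holds the records (each with its data in field 3), and a trailing 16-byte MD5 of that message. The names `deaggregate`, `iter_fields` and `read_varint` are made up for this example, and the handler assumes the usual Lambda Kinesis event shape with base64-encoded data.

```python
import base64
import hashlib
import json

# Magic prefix of an aggregated record (assumed KPL layout).
MAGIC = b"\xf3\x89\x9a\xc2"
DIGEST_SIZE = 16  # trailing MD5 of the protobuf body


def read_varint(buf, pos):
    """Decode a protobuf base-128 varint at buf[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, value) for varint and length-delimited fields."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, value


def deaggregate(payload):
    """Return the individual record payloads (bytes) packed into `payload`."""
    if not payload.startswith(MAGIC):
        return [payload]  # not aggregated: treat it as a single plain record
    body = payload[len(MAGIC):-DIGEST_SIZE]
    if hashlib.md5(body).digest() != payload[-DIGEST_SIZE:]:
        raise ValueError("checksum mismatch: payload is truncated or altered")
    records = []
    for field_number, value in iter_fields(body):
        if field_number == 3:  # one packed record (sub-message)
            for inner_number, inner_value in iter_fields(value):
                if inner_number == 3:  # the record's data bytes
                    records.append(inner_value)
    return records


def handler(event, context):
    """Hypothetical Lambda entry point: decode each record in the batch."""
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        for item in deaggregate(raw):
            print(json.loads(item.decode("utf-8")))
```

Each element returned by `deaggregate` is one of the original JSON documents, which decodes as UTF-8 without error. Note that the bytes literal pasted in the question may not survive copy and paste intact, so the checksum test can reject it even though the real stream data is fine. For production use, a maintained de-aggregation library (AWS Labs publishes one for Python, as far as I know under the name `aws-kinesis-agg`) is probably a safer choice than hand-rolled parsing.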