python json load set encoding to utf-8

Question:

I have this code:

keys_file = open("keys.json")
keys = keys_file.read().encode('utf-8')
keys_json = json.loads(keys)
print(keys_json)

There are some none-english characters in keys.json.
But as a result I get:

[{'category': 'мбт', 'keys': ['Блендер Philips',
'мультиварка Polaris']}, {'category': 'КБТ', 'keys':
['холод ильник атлант', 'посудомоечная
машина Bosch']}]

what do I do?

Asked By: user2950593

||

Answers:

encode means characters to binary. What you want when reading a file is binary to charactersdecode. But really this entire process is way too manual, simply do this:

with open('keys.json', encoding='utf-8') as fh:
    data = json.load(fh)

print(data)

with handles the correct opening and closing of the file, the encoding argument to open ensures the file is read using the correct encoding, and the load call reads directly from the file handle instead of storing a copy of the file contents in memory first.

If this still outputs invalid characters, it means your source encoding isn’t UTF-8 or your console/terminal doesn’t handle UTF-8.

Answered By: deceze
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.