How to encode json dump correctly in python
Question:
I have a python 3 script that should get some data from a .csv-file and write it to a json file.
During my processing the encoding is correct, so that german umlauts ü, ä or degree sign ° are like they are (# coding=cp1252 at the head).
But when I write the dict via json.dump() the encoding is gone…
How can I write a dict to a json file with the correct encoding?
# -*- coding: cp1252 -*-
import json
from pandas import read_csv
x={"äö": "ü°"}
print(x, json.dumps(x, indent=4))
>>>> {'äö': 'ü°'} {"u00e4u00f6": "u00fcu00b0"}
Answers:
This is happening because ä
, ö
, ü
, and °
aren’t ASCII characters.
json.dumps
has an optional argument called ensure_ascii
which escapes non-ASCII characters, and it’s set to True
by default. You can get your desired behavior by setting this to false.
x={"äö": "ü°"}
print(x, json.dumps(x, ensure_ascii=False, indent=4))
I have a python 3 script that should get some data from a .csv-file and write it to a json file.
During my processing the encoding is correct, so that german umlauts ü, ä or degree sign ° are like they are (# coding=cp1252 at the head).
But when I write the dict via json.dump() the encoding is gone…
How can I write a dict to a json file with the correct encoding?
# -*- coding: cp1252 -*-
import json
from pandas import read_csv
x={"äö": "ü°"}
print(x, json.dumps(x, indent=4))
>>>> {'äö': 'ü°'} {"u00e4u00f6": "u00fcu00b0"}
This is happening because ä
, ö
, ü
, and °
aren’t ASCII characters.
json.dumps
has an optional argument called ensure_ascii
which escapes non-ASCII characters, and it’s set to True
by default. You can get your desired behavior by setting this to false.
x={"äö": "ü°"}
print(x, json.dumps(x, ensure_ascii=False, indent=4))