How to dump a Python dictionary to JSON when keys are non-trivial objects?

Question:

import datetime, json
x = {'alpha': {datetime.date.today(): 'abcde'}}
print(json.dumps(x))

The above code fails with a TypeError since keys of JSON objects need to be strings. The json.dumps function has a parameter called default that is called when the value of a JSON object raises a TypeError, but there seems to be no way to do this for the key. What is the most elegant way to work around this?

Asked By: ipartola

Answers:

You can extend json.JSONEncoder to create your own encoder which will be able to deal with datetime.datetime objects (or objects of any type you desire) in such a way that a string is created which can be reproduced as a new datetime.datetime instance. I believe it should be as simple as having json.JSONEncoder call repr() on your datetime.datetime instances.

The procedure on how to do so is described in the json module docs.

The json module checks the type of each value it needs to encode, and by default it only knows how to handle dicts, lists, tuples, strs, ints, floats, booleans and None (plus unicode and long on Python 2) 🙂
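A minimal sketch of such an encoder, assuming ISO 8601 strings are an acceptable representation (the class name DateEncoder is made up for illustration). Note that it only helps when the date appears as a value, not as a key:

```python
import datetime
import json


class DateEncoder(json.JSONEncoder):
    """Hypothetical encoder: turns date/datetime values into ISO 8601 strings."""

    def default(self, o):
        # default() is only called for values the encoder cannot handle itself
        if isinstance(o, (datetime.date, datetime.datetime)):
            return o.isoformat()
        # anything else falls back to the standard TypeError
        return super().default(o)


data = {"created": datetime.date(2010, 9, 15)}
encoded = json.dumps(data, cls=DateEncoder)
print(encoded)  # {"created": "2010-09-15"}
```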

Also of importance for you might be the skipkeys argument to the JSONEncoder.
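To illustrate what skipkeys does (a small sketch with made-up sample data): it does not convert the offending keys, it just leaves them out instead of raising a TypeError, so it is only useful if losing those entries is acceptable.

```python
import datetime
import json

data = {datetime.date(2010, 9, 15): "abcde", "plain": "kept"}

# skipkeys=True silently drops keys that are not str, int, float, bool or None
skipped = json.dumps(data, skipkeys=True)

# only the "plain" entry survives the round trip
print(json.loads(skipped))
```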


After reading your comments I have concluded that there is no easy solution to have JSONEncoder encode the keys of dictionaries with a custom function. If you are interested you can look at the source and the methods iterencode() which calls _iterencode() which calls _iterencode_dict() which is where the type error gets raised.

Easiest for you would be to create a new dict with isoformatted keys like this:

import datetime, json

D = {datetime.datetime.now(): 'foo',
     datetime.datetime.now(): 'bar'}

new_D = {}

for k, v in D.items():  # use D.iteritems() on Python 2
    new_D[k.isoformat()] = v

json.dumps(new_D)

Which returns '{"2010-09-15T23:24:36.169710": "foo", "2010-09-15T23:24:36.169723": "bar"}'. For niceties, wrap it in a function 🙂
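One possible way to wrap it in a function, as a rough sketch: the name stringify_keys is made up here, and it assumes you only need to convert date/datetime keys, including those in nested dicts like the one in the question.

```python
import datetime
import json


def stringify_keys(obj):
    """Return a copy of obj in which date/datetime dict keys are ISO strings."""
    if isinstance(obj, dict):
        converted = {}
        for key, value in obj.items():
            if isinstance(key, (datetime.date, datetime.datetime)):
                key = key.isoformat()
            # recurse so that nested dicts get the same treatment
            converted[key] = stringify_keys(value)
        return converted
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(item) for item in obj]
    return obj


x = {"alpha": {datetime.date(2010, 9, 15): "abcde"}}
result = json.dumps(stringify_keys(x))
print(result)  # {"alpha": {"2010-09-15": "abcde"}}
```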

Answered By: supakeen

http://jsonpickle.github.io/ might be what you want. When facing a similar issue, I ended up doing:

import jsonpickle

to_save = jsonpickle.encode(THE_THING, unpicklable=False, max_depth=4, make_refs=False)
Answered By: seanp2k

You can convert the key to a string yourself when building the dictionary:
x = {'alpha': {datetime.date.today().strftime('%d-%m-%Y'): 'abcde'}}
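If you also need the dates back after loading, the same format string can be reused with strptime. A small sketch, with a fixed date instead of today() so the result is reproducible:

```python
import datetime
import json

fmt = "%d-%m-%Y"  # same format string as in the snippet above
day = datetime.date(2010, 9, 15)

encoded = json.dumps({"alpha": {day.strftime(fmt): "abcde"}})

# Reading it back: parse each inner key with the same format string
decoded = {
    datetime.datetime.strptime(key, fmt).date(): value
    for key, value in json.loads(encoded)["alpha"].items()
}
print(decoded == {day: "abcde"})  # True
```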

Answered By: 108

If you really need to do it, you can monkeypatch json.encoder:

import json
from _json import encode_basestring_ascii  # used when ensure_ascii=True (which is the default where you want everything to be ascii)
from _json import encode_basestring  # used in any other case

def _patched_encode_basestring(o):
    """
    Monkey-patching Python's json serializer so it can serialize keys that are not string!
    You can monkey patch the ascii one the same way.
    """
    if isinstance(o, MyClass):
        return my_serialize(o)
    return encode_basestring(o)


json.encoder.encode_basestring = _patched_encode_basestring
Answered By: Seperman

The json module only accepts a fixed set of data types for encoding. As @supakeen mentioned, you can extend the JSONEncoder class to encode arbitrary values inside a dictionary, but not keys! If you want to encode keys, you have to do it on your own.

I used a recursive function in order to encode tuple-keys as strings and recover them later.

Here is an example:

import copy
from typing import Any


def _tuple_to_string(obj: Any) -> Any:
    """Serialize tuple-keys to string representation. A tuple will obtain a leading '__tuple__' string and be decomposed into list representation.

    Args:
        obj (Any): Typically a dict, tuple, list, int, or string.

    Returns:
        Any: Input object with serialized tuples.
    """
    # deep copy object to avoid manipulation during iteration
    obj_copy = copy.deepcopy(obj)
    # if the object is a dictionary
    if isinstance(obj, dict):
        # iterate over every key
        for key in obj:
            # set for later to avoid modification in later iterations when this var does not get overwritten
            serialized_key = None
            # if key is tuple
            if isinstance(key, tuple):
                # stringify the key
                serialized_key = f"__tuple__{list(key)}"
                # replace old key with encoded key
                obj_copy[serialized_key] = obj_copy.pop(key)
            # if the key was modified
            if serialized_key is not None:
                # do it again for the next nested dictionary
                obj_copy[serialized_key] = _tuple_to_string(obj[key])
            # else, just do it for the next dictionary
            else:
                obj_copy[key] = _tuple_to_string(obj[key])
    return obj_copy

This will turn a tuple of the form ("blah", "blub") into the string "__tuple__['blah', 'blub']" so that you can dump it using json.dumps() or json.dump(). You can use the leading "__tuple__" to detect such keys during decoding. For that, I used this function:

def _string_to_tuple(obj: Any) -> Any:
    """Convert serialized tuples back to original representation. Tuples need to have a leading "__tuple__" string.

    Args:
        obj (Any): Typically a dict, tuple, list, int, or string.

    Returns:
        Any: Input object with recovered tuples.
    """
    # deep copy object to avoid manipulation during iteration
    obj_copy = copy.deepcopy(obj)
    # if the object is a dictionary
    if isinstance(obj, dict):
        # iterate over every key
        for key in obj:
            # set for later to avoid modification in later iterations when this var does not get overwritten
            serialized_key = None
            # if key is a serialized tuple starting with the "__tuple__" prefix
            if isinstance(key, str) and key.startswith("__tuple__"):
                # decode it to tuple
                serialized_key = tuple(key.split("__tuple__")[1].strip("[]").replace("'", "").split(", "))
                # if every entry is a number in string representation
                if all(entry.isdigit() for entry in serialized_key):
                    # convert to integer
                    serialized_key = tuple(map(int, serialized_key))
                # replace old key with decoded key
                obj_copy[serialized_key] = obj_copy.pop(key)
            # if the key was modified
            if serialized_key is not None:
                # do it again for the next nested dictionary
                obj_copy[serialized_key] = _string_to_tuple(obj[key])
            # else, just do it for the next dictionary
            else:
                obj_copy[key] = _string_to_tuple(obj[key])
    # if the object is a list, recover tuples inside each item and keep the result
    elif isinstance(obj, list):
        obj_copy = [_string_to_tuple(item) for item in obj]
    return obj_copy

Insert your custom logic for en-/decoding your instance by changing the

if isinstance(key, tuple):
    # stringify the key
    serialized_key = f"__tuple__{list(key)}"

in the _tuple_to_string function or the corresponding code block from the _string_to_tuple function, respectively:

if isinstance(key, str) and key.startswith("__tuple__"):
    # decode it to tuple
    serialized_key = tuple(key.split("__tuple__")[1].strip("[]").replace("'", "").split(", "))
    # if key is number in string representation
    if all(entry.isdigit() for entry in serialized_key):
        # convert to integer
        serialized_key = tuple(map(int, serialized_key))

Then, you can use it as usual:

>>> dct = {("L1", "L1"): {("L2", "L2"): "foo"}}
>>> print(json.dumps(_tuple_to_string(dct)))
{"__tuple__['L1', 'L1']": {"__tuple__['L2', 'L2']": "foo"}}
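As a variation on the same idea (not the functions above, just a sketch with made-up names encode_keys and decode_keys), you could store the tuple as a JSON list after the "__tuple__" marker. Decoding then uses json.loads instead of string splitting, so element types such as int and str survive the round trip. It assumes flat tuples whose elements are themselves JSON-compatible.

```python
import json

PREFIX = "__tuple__"  # same marker as in the functions above


def encode_keys(obj):
    """Replace tuple keys with PREFIX + JSON list, recursively."""
    if isinstance(obj, dict):
        out = {}
        for key, value in obj.items():
            if isinstance(key, tuple):
                key = PREFIX + json.dumps(list(key))
            out[key] = encode_keys(value)
        return out
    if isinstance(obj, list):
        return [encode_keys(item) for item in obj]
    return obj


def decode_keys(obj):
    """Turn PREFIX-marked string keys back into tuples, recursively."""
    if isinstance(obj, dict):
        out = {}
        for key, value in obj.items():
            if isinstance(key, str) and key.startswith(PREFIX):
                key = tuple(json.loads(key[len(PREFIX):]))
            out[key] = decode_keys(value)
        return out
    if isinstance(obj, list):
        return [decode_keys(item) for item in obj]
    return obj


dct = {("L1", 1): {("L2", "L2"): "foo"}}
text = json.dumps(encode_keys(dct))
restored = decode_keys(json.loads(text))
print(restored == dct)  # True
```

The trade-off is that the keys in the JSON text contain escaped quotes, which is a bit less readable than the "__tuple__['L1', 'L1']" form.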

Hope this helps!

Answered By: Peter Lustig

This is something that CANNOT BE DONE. That is, neither the default function in json nor extending JSONEncoder will work for keys. See this issue:

https://github.com/python/cpython/issues/63020

The reason is that the developers think that supporting anything other than strings as keys for serialization should be discouraged.
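You can check this yourself with a quick experiment. In the sketch below (fallback is just an illustrative name), the function passed as default records every object it is asked to convert; it gets called for the date used as a value, but never for the date used as a key:

```python
import datetime
import json

calls = []


def fallback(o):
    # record every object json hands to default=
    calls.append(o)
    return str(o)


day = datetime.date(2010, 9, 15)

# As a value, the date goes through default= and is serialized
as_value = json.dumps({"when": day}, default=fallback)

# As a key, json raises TypeError without ever calling default=
try:
    json.dumps({day: "abcde"}, default=fallback)
    key_failed = False
except TypeError:
    key_failed = True

print(as_value)        # {"when": "2010-09-15"}
print(calls == [day])  # True: default= was called once, for the value only
print(key_failed)      # True
```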

See also:
json.dump not calling default or cls

Answered By: Heberto Mayorquin