How to convert to a Python datetime object with json.loads?

Question:

I have a string representation of a JSON object.

dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'

When I call json.loads with this string:

json.loads(dumped_dict)

I get:

{'created_at': '2020-08-09T11:24:20', 'debug': False}

There is nothing wrong here. However, I want to know if there is a way to have json.loads convert the above object to something like this:

{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}

In short, is it possible to convert datetime strings to actual datetime.datetime objects while calling json.loads?

Asked By: Ozgur Vatansever


Answers:

As far as I know, there is no out-of-the-box solution for this.

First of all, the solution should take the JSON schema into account to correctly distinguish between strings and datetimes. To some extent, you can guess the schema with a JSON schema inferencer (search for "json schema inferencer github") and then fix the places that are really datetimes.

If the schema is known, it should be pretty easy to write a function that parses the JSON and replaces the string representations with datetime objects. Some inspiration for the code could perhaps be found in the validictory package (and JSON schema validation could also be a good idea).
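As a minimal sketch of the "known schema" idea, the hook below converts only the keys listed in a set. DATETIME_FIELDS and convert_known_fields are hypothetical names introduced for illustration, and the sketch assumes the fields use the same format as the question's example.

```python
import json
from datetime import datetime

# Hypothetical "schema": the keys known to hold datetimes in this format.
DATETIME_FIELDS = {"created_at"}

def convert_known_fields(dct):
    """object_hook that converts only the keys listed in DATETIME_FIELDS."""
    for key in dct.keys() & DATETIME_FIELDS:
        value = dct[key]
        if isinstance(value, str):
            dct[key] = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    return dct

dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
loaded = json.loads(dumped_dict, object_hook=convert_known_fields)
print(loaded)
```

Because only whitelisted keys are touched, other strings that merely look like dates are left alone. A malformed value in a listed field raises ValueError rather than being silently skipped.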

Answered By: Roman Susi

As your question is posed, nothing tells json that the string is a date value. This differs from the example in the json documentation, which uses the string:

'{"__complex__": true, "real": 1, "imag": 2}'

This string has an indicator "__complex__": true that can be used to infer the type of the data, but unless there is such an indicator, a string is just a string, and all you can do is to regexp your way through all strings and decide whether they look like dates.
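For illustration, here is a rough sketch of the same indicator idea applied to dates. The "__datetime__" marker and the "value" key are hypothetical conventions chosen for this sketch, not something json defines, and both the producer and the consumer of the JSON would have to agree on them.

```python
import json
from datetime import datetime

def tagged_datetime_hook(dct):
    # "__datetime__" is a hypothetical marker key chosen for this sketch.
    if dct.get("__datetime__") is True:
        return datetime.strptime(dct["value"], "%Y-%m-%dT%H:%M:%S")
    return dct

tagged = ('{"debug": false, "created_at": '
          '{"__datetime__": true, "value": "2020-08-09T11:24:20"}}')
result = json.loads(tagged, object_hook=tagged_datetime_hook)
print(result)
```

The hook runs on the inner object first, so by the time the outer dict is built its created_at value is already a datetime.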

In your case you should definitely use a schema if one is available for your format.

Answered By: Dov Grobgeld

My solution so far:

>>> json_string = '{"last_updated": {"$gte": "Thu, 1 Mar 2012 10:00:49 UTC"}}'
>>> dct = json.loads(json_string, object_hook=datetime_parser)
>>> dct
{u'last_updated': {u'$gte': datetime.datetime(2012, 3, 1, 10, 0, 49)}}


def datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, basestring) and re.search(" UTC", v):
            try:
                dct[k] = datetime.datetime.strptime(v, DATE_FORMAT)
            except:
                pass
    return dct

For further reference on the use of object_hook: JSON encoder and decoder

In my case the json string is coming from a GET request to my REST API. This solution allows me to ‘get the date right’ transparently, without forcing clients and users into hardcoding prefixes like __date__ into the JSON, as long as the input string conforms to DATE_FORMAT which is:

DATE_FORMAT = '%a, %d %b %Y %H:%M:%S UTC'

The regex pattern should probably be refined further.

PS: in case you are wondering, the json_string is a MongoDB/PyMongo query.

Answered By: Nicola Iarocci

You need to pass an object_hook. From the documentation:

object_hook is an optional function that will be called with the
result of any object literal decoded (a dict). The return value of
object_hook will be used instead of the dict.

Like this:

import datetime
import json

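def date_hook(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
        except (TypeError, ValueError):
            # TypeError: value is not a string; ValueError: string is not in this format
            pass
    return json_dict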

dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
loaded_dict = json.loads(dumped_dict, object_hook=date_hook)

If you also want to handle timezones you’ll have to use dateutil instead of strptime.
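On Python 3.7 or later, the standard library's datetime.fromisoformat is another option for ISO 8601 strings with a UTC offset. The sketch below swaps it in for strptime; iso_date_hook is a hypothetical name, and the exact set of accepted strings depends on the Python version.

```python
import json
from datetime import datetime, timedelta

def iso_date_hook(json_dict):
    for key, value in json_dict.items():
        if not isinstance(value, str):
            continue
        try:
            # Handles offsets such as "+02:00" (Python 3.7+).
            json_dict[key] = datetime.fromisoformat(value)
        except ValueError:
            pass  # not an ISO 8601 string; leave it unchanged
    return json_dict

aware = json.loads('{"created_at": "2020-08-09T11:24:20+02:00"}',
                   object_hook=iso_date_hook)
print(aware["created_at"].utcoffset())
```

Note that fromisoformat also accepts date-only strings such as "2020-08-09", so this hook may convert more values than a strict strptime format would.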

Answered By: galarant

You could use regex to determine whether or not you want to convert a certain field to datetime like so:

import datetime
import re

def date_hook(json_dict):
    for (key, value) in json_dict.items():
        # With fractional seconds (%f accepts 1 to 6 digits)
        if type(value) is str and re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
        # Without fractional seconds
        elif type(value) is str and re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")

    return json_dict

Then you can reference the date_hook function using the object_hook parameter in your call to json.loads():

json_data = '{"token": "faUIO/389KLDLA", "created_at": "2016-09-15T09:54:20.564"}'
data_dictionary = json.loads(json_data, object_hook=date_hook)
Answered By: Maciej

I would do the same as Nicola suggested with 2 changes:

  1. Use dateutil.parser instead of datetime.datetime.strptime
  2. Explicitly list the exceptions I want to catch. I generally recommend avoiding a bare except clause at all costs.

Or in code:

import json

import dateutil.parser

def datetime_parser(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = dateutil.parser.parse(value)
        except (ValueError, TypeError, AttributeError):
            # Not a date string, or not a string at all (e.g. bool, dict)
            pass
    return json_dict

json_str = "{...}"  # Some JSON with date
obj = json.loads(json_str, object_hook=datetime_parser)
print(obj)
Answered By: Uri Shalit

Inspired by Nicola’s answer and adapted to python3 (str instead of basestring):

import re
from datetime import datetime
datetime_format = "%Y-%m-%dT%H:%M:%S"
datetime_format_regex = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$')


def datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, str) and datetime_format_regex.match(v):
            dct[k] = datetime.strptime(v, datetime_format)
    return dct

This avoids using a try/except mechanism.
On OP’s test code:

>>> import json
>>> json_string = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
>>> json.loads(json_string, object_hook=datetime_parser)
{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}

The regex and datetime_format variables can be easily adapted to fit other patterns, e.g. without the T in the middle.
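As a sketch of such an adaptation, here is a variant for strings that use a space instead of the T. The names space_format, space_format_regex and space_datetime_parser are hypothetical and exist only in this example.

```python
import json
import re
from datetime import datetime

# Variant for "YYYY-MM-DD HH:MM:SS" (a space instead of the T).
space_format = "%Y-%m-%d %H:%M:%S"
space_format_regex = re.compile(r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$')

def space_datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, str) and space_format_regex.match(v):
            dct[k] = datetime.strptime(v, space_format)
    return dct

parsed = json.loads('{"created_at": "2020-08-09 11:24:20"}',
                    object_hook=space_datetime_parser)
print(parsed)
```

The regex only checks the shape of the string, so a value such as "2020-13-45 99:99:99" would still match and then make strptime raise ValueError.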

To convert a string saved in isoformat (therefore stored with microseconds) back to a datetime object, refer to this question.

Answered By: Alberto Chiusole

This function recursively searches dicts and lists for strings in a date-time format and converts them:

import json
from dateutil.parser import parse

def datetime_parser(value):
    if isinstance(value, dict):
        for k, v in value.items():
            value[k] = datetime_parser(v)
    elif isinstance(value, list):
        for index, row in enumerate(value):
            value[index] = datetime_parser(row)
    elif isinstance(value, str) and value:
        try:
            value = parse(value)
        except (ValueError, AttributeError):
            pass
    return value

json_to_dict = json.loads(YOUR_JSON_STRING, object_hook=datetime_parser)
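
One thing to keep in mind is that object_hook is only called for JSON objects, so a top-level JSON array of strings never reaches it. A recursive walk can instead be applied to the result of json.loads. The sketch below assumes ISO 8601 strings and uses the standard library's datetime.fromisoformat as a stand-in for dateutil; convert_dates is a hypothetical name.

```python
import json
from datetime import datetime

def convert_dates(value):
    """Recursively replace ISO 8601 strings with datetime objects."""
    if isinstance(value, dict):
        return {k: convert_dates(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_dates(item) for item in value]
    if isinstance(value, str):
        try:
            return datetime.fromisoformat(value)
        except ValueError:
            return value  # not a date string; keep as-is
    return value

raw = '["2020-08-09T11:24:20", {"seen": ["2021-01-01T00:00:00"]}, 3]'
events = convert_dates(json.loads(raw))
print(events)
```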
Answered By: Maksim Senchuk

Although it technically works to just pass an object hook function, I recommend using a proper subclass of JSONDecoder, as intended by the framework developers:

import datetime
import json

class _JSONDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        # Route every decoded JSON object through self.object_hook.
        super().__init__(*args, object_hook=self.object_hook, **kwargs)

    def object_hook(self, obj):
        ret = {}
        for key, value in obj.items():
            if key in {'timestamp', 'whatever'}:
                # fromisoformat is a classmethod of datetime.datetime (Python 3.7+)
                ret[key] = datetime.datetime.fromisoformat(value)
            else:
                ret[key] = value
        return ret

For the sake of completeness, here is the counterpart to the decoder, the custom JSONEncoder:

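class _JSONEncoder(json.JSONEncoder):
    def default(self, obj):
        # pd is pandas (import pandas as pd); pd.Timestamp subclasses datetime.datetime.
        if isinstance(obj, (datetime.date, datetime.datetime, pd.Timestamp)):
            return obj.isoformat()
        # Defer to the base class, which raises TypeError for unsupported types.
        return super().default(obj)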

Both in action look like:

json_str = json.dumps({'timestamp': datetime.datetime.now()}, cls=_JSONEncoder)
d = json.loads(json_str, cls=_JSONDecoder)
Answered By: Michael Dorner

The solutions that suggest creating a JSON encoder and decoder are all perfectly valid. The only drawback I can see is a slight performance impact, which can occur if every JSON value is scanned to check whether it matches a date/time format.

Here’s the approach I would take, using the dataclass-wizard library (note: it is designed to work for API responses actually)

  1. Use the included CLI utility to convert the JSON response to a dataclass schema. Note that the value of debug is encoded as a string here, so I’m passing -f so that it force-resolves to a Python bool type. Otherwise, it should appear as Union[bool, str], which is the default inferred type.

    $ echo '{"debug": "false", "created_at": "2020-08-09T11:24:20"}' | wiz gs -f
    

    Output, including the imports at the top (not shown):

    @dataclass
    class Data(JSONWizard):
        """
        Data dataclass
    
        """
        debug: bool
        created_at: datetime
    
  2. Now we can de-serialize the sample JSON string above into a Data object. Note that
    created_at should come as datetime type. Similarly with the value for debug, it should be decoded as bool.

    string = """{"debug": "false", "created_at": "2020-08-09T11:24:20"}"""
    
    c = Data.from_json(string)
    
    print(repr(c))
    
  3. Serialize it back to JSON. The datetime object should be converted back
    to a string:

    print(c.to_json())
    # {"debug": false, "createdAt": "2020-08-09T11:24:20"}
    
Answered By: rv.kvetch

In most cases this is a two-way problem: if you use a custom encoder, you'll probably want a custom decoder as well (and vice versa). In this case the decoder should be able to parse the encoded data and return the original object.

Below is a full exercise that converts non-serializable Python objects to JSON using two different strategies:

  1. Patching the JSONEncoder class to serialize any class that implements a "__json__" method.
  2. Using a dictionary of "converter" functions to handle specific Python types.

In the example below, I serialize an Enum class using a custom __json__ method as a {enum.name: enum.value} dict. Here the enum.value objects are types that do not survive a plain JSON round trip (date and tuple); the RskJSONEncoder class converts them to serializable values, and the functions listed in CONVERTERS convert them back.

Once encoded, the custom_json_decoder function can be invoked to convert that JSON back to Python types.
The example script below is complete; it should run "as is":

from enum import Enum
from dateutil.parser import parse as dtparse
from datetime import datetime
from datetime import date
from json import JSONEncoder
from json import loads as json_loads
from json import dumps as json_dumps


def wrapped_default(self, obj):
    json_parser = getattr(obj.__class__, "__json__", lambda x: x.__dict__)
    try:
        return json_parser(obj)
    except Exception:
        return wrapped_default.default(obj)


wrapped_default.default = JSONEncoder().default
JSONEncoder.default = wrapped_default

CONVERTERS = {
    "datetime": dtparse,
    "date": lambda x: datetime.strptime(x, "%Y%m%d").date(),
    "tuple": lambda x: tuple(x),
}


class RskJSONEncoder(JSONEncoder):
    def default(self, obj):
        # datetime is a subclass of date, so it must be checked first.
        if isinstance(obj, datetime):
            return {"val": obj.isoformat(), "pythontype": "datetime"}
        elif isinstance(obj, date):
            return {"val": obj.strftime("%Y%m%d"), "pythontype": "date"}
        elif isinstance(obj, tuple):
            # Not reached in practice: json encodes tuples as lists before calling default().
            return {"val": list(obj), "pythontype": "tuple"}
        return super().default(obj)


def custom_json_decoder(obj):
    def json_hook(json_obj):
        try:
            return CONVERTERS[json_obj.pop("pythontype")](json_obj["val"])
        except Exception:
            res = json_obj
        return res

    return json_loads(obj, object_hook=json_hook)


def custom_json_encoder(obj):
    return json_dumps(obj, cls=RskJSONEncoder)


if __name__ == "__main__":

    class Test(Enum):
        A = date(2021, 1, 1)
        B = ("this", " is", " a", " tuple")

        def __json__(self):
            return {self.name: self.value}

    d = {"enum_date": Test.A, "enum_tuple": Test.B}
    this_is_json = custom_json_encoder(d)
    this_is_python_obj = custom_json_decoder(this_is_json)
    print(f"this is json, type={type(this_is_json)}\n", this_is_json)
    print(
        f"this is python, type={type(this_is_python_obj)}\n",
        this_is_python_obj,
    )
Answered By: Bravhek

If you are looking for a Django JSON serializer:

import json

from django.utils.timezone import now
from django.core.serializers.json import DjangoJSONEncoder
from django.utils.dateparse import parse_datetime

dt = now()
sdt = json.dumps(dt.strftime('%Y-%m-%dT%H:%M:%S'))
ndt = parse_datetime(json.loads(sdt))
print(sdt)
# "2022-04-27T12:20:23"
print(ndt)
# 2022-04-27 12:20:23
Answered By: Weilory