Deserialize a json string to an object in python

Question:

I have the following string

{"action":"print","method":"onData","data":"Madan Mohan"}

I Want to deserialize to a object of class

class payload
    string action
    string method
    string data

I am using python 2.6 and 2.7

Asked By: user1855484

||

Answers:

You can specialize an encoder for object creation: http://docs.python.org/2/library/json.html

import json
class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, complex):
            return {"real": obj.real,
            "imag": obj.imag,
            "__class__": "complex"}
        return json.JSONEncoder.default(self, obj)

print json.dumps(2 + 1j, cls=ComplexEncoder)
Answered By: Sami N
>>> j = '{"action": "print", "method": "onData", "data": "Madan Mohan"}'
>>> import json
>>> 
>>> class Payload(object):
...     def __init__(self, j):
...         self.__dict__ = json.loads(j)
... 
>>> p = Payload(j)
>>>
>>> p.action
'print'
>>> p.method
'onData'
>>> p.data
'Madan Mohan'
Answered By: John La Rooy

To elaborate on Sami’s answer:

From the docs:

class Payload(object):
    def __init__(self, action, method, data):
        self.action = action
        self.method = method
        self.data = data

import json

def as_payload(dct):
    return Payload(dct['action'], dct['method'], dct['data'])

payload = json.loads(message, object_hook = as_payload)

My objection to the

.__dict__ 

solution is that while it does the job and is concise, the Payload class becomes totally generic – it doesn’t document its fields.

For example, if the Payload message had an unexpected format, instead of throwing a key not found error when the Payload was created, no error would be generated until the payload was used.

Answered By: Alex

If you want to save lines of code and leave the most flexible solution, we can deserialize the json string to a dynamic object:

p = lambda:None
p.__dict__ = json.loads('{"action": "print", "method": "onData", "data": "Madan Mohan"}')

>>>> p.action

output: u’print’

>>>> p.method
output: u’onData’

Answered By: Nery Jr

I prefer to add some checking of the fields, e.g. so you can catch errors like when you get invalid json, or not the json you were expecting, so I used namedtuples:

from collections import namedtuple
payload = namedtuple('payload', ['action', 'method', 'data'])
def deserialize_payload(json):
    kwargs =  dict([(field, json[field]) for field in payload._fields]) 
    return payload(**kwargs)

this will let give you nice errors when the json you are parsing does not match the thing you want it to parse

>>> json = {"action":"print","method":"onData","data":"Madan Mohan"}
>>> deserialize_payload(json)
payload(action='print', method='onData', data='Madan Mohan')
>>> badjson = {"error":"404","info":"page not found"}
>>> deserialize_payload(badjson)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in deserialize_payload
KeyError: 'action'

if you want to parse nested relations, e.g. '{"parent":{"child":{"name":"henry"}}}'
you can still use the namedtuples, and even a more reusable function

Person = namedtuple("Person", ['parent'])
Parent = namedtuple("Parent", ['child'])
Child = namedtuple('Child', ['name'])
def deserialize_json_to_namedtuple(json, namedtuple):
    return namedtuple(**dict([(field, json[field]) for field in namedtuple._fields]))

def deserialize_person(json):
     json['parent']['child']  = deserialize_json_to_namedtuple(json['parent']['child'], Child)
     json['parent'] =  deserialize_json_to_namedtuple(json['parent'], Parent) 
     person = deserialize_json_to_namedtuple(json, Person)
     return person

giving you

>>> deserialize_person({"parent":{"child":{"name":"henry"}}})
Person(parent=Parent(child=Child(name='henry')))
>>> deserialize_person({"error":"404","info":"page not found"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in deserialize_person
KeyError: 'parent'
Answered By: Jens Timmerman

If you are embracing the type hints in Python 3.6, you can do it like this:

def from_json(data, cls):
    annotations: dict = cls.__annotations__ if hasattr(cls, '__annotations__') else None
    if issubclass(cls, List):
        list_type = cls.__args__[0]
        instance: list = list()
        for value in data:
            instance.append(from_json(value, list_type))
        return instance
    elif issubclass(cls, Dict):
            key_type = cls.__args__[0]
            val_type = cls.__args__[1]
            instance: dict = dict()
            for key, value in data.items():
                instance.update(from_json(key, key_type), from_json(value, val_type))
            return instance
    else:
        instance : cls = cls()
        for name, value in data.items():
            field_type = annotations.get(name)
            if inspect.isclass(field_type) and isinstance(value, (dict, tuple, list, set, frozenset)):
                setattr(instance, name, from_json(value, field_type))
            else:
                setattr(instance, name, value)
        return instance

Which then allows you do instantiate typed objects like this:

class Bar:
    value : int

class Foo:
    x : int
    bar : List[Bar]


obj : Foo = from_json(json.loads('{"x": 123, "bar":[{"value": 3}, {"value": 2}, {"value": 1}]}'), Foo)
print(obj.x)
print(obj.bar[2].value)

This syntax requires Python 3.6 though and does not cover all cases – for example, support for typing.Any… But at least it does not pollute the classes that need to be deserialized with extra init/tojson methods.

Answered By: GaspardP

I thought I lose all my hairs for solving this ‘challenge’. I faced following problems:

  1. How to deserialize nested objects, lists etc.
  2. I like constructors with specified fields
  3. I don’t like dynamic fields
  4. I don’t like hacky solutions

I found a library called jsonpickle which is has proven to be really useful.

Installation:

pip install jsonpickle

Here is a code example with writing nested objects to file:

import jsonpickle


class SubObject:
    def __init__(self, sub_name, sub_age):
        self.sub_name = sub_name
        self.sub_age = sub_age


class TestClass:

    def __init__(self, name, age, sub_object):
        self.name = name
        self.age = age
        self.sub_object = sub_object


john_junior = SubObject("John jr.", 2)

john = TestClass("John", 21, john_junior)

file_name = 'JohnWithSon' + '.json'

john_string = jsonpickle.encode(john)

with open(file_name, 'w') as fp:
    fp.write(john_string)

john_from_file = open(file_name).read()

test_class_2 = jsonpickle.decode(john_from_file)

print(test_class_2.name)
print(test_class_2.age)
print(test_class_2.sub_object.sub_name)

Output:

John
21
John jr.

Website: http://jsonpickle.github.io/

Hope it will save your time (and hairs).

Answered By: LukaszTaraszka

Another way is to simply pass the json string as a dict to the constructor of your object. For example your object is:

class Payload(object):
    def __init__(self, action, method, data, *args, **kwargs):
        self.action = action
        self.method = method
        self.data = data

And the following two lines of python code will construct it:

j = json.loads(yourJsonString)
payload = Payload(**j)

Basically, we first create a generic json object from the json string. Then, we pass the generic json object as a dict to the constructor of the Payload class. The constructor of Payload class interprets the dict as keyword arguments and sets all the appropriate fields.

Answered By: rouble

In recent versions of python, you can use marshmallow-dataclass:

from marshmallow_dataclass import dataclass

@dataclass
class Payload
    action:str
    method:str
    data:str

Payload.Schema().load({"action":"print","method":"onData","data":"Madan Mohan"})
Answered By: lovasoa

While Alex’s answer points us to a good technique, the implementation that he gave runs into a problem when we have nested objects.

class more_info
    string status

class payload
    string action
    string method
    string data
    class more_info

with the below code:

def as_more_info(dct):
    return MoreInfo(dct['status'])

def as_payload(dct):
    return Payload(dct['action'], dct['method'], dct['data'], as_more_info(dct['more_info']))

payload = json.loads(message, object_hook = as_payload)

payload.more_info will also be treated as an instance of payload which will lead to parsing errors.

From the official docs:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.

Hence, I would prefer to propose the following solution instead:

class MoreInfo(object):
    def __init__(self, status):
        self.status = status

    @staticmethod
    def fromJson(mapping):
        if mapping is None:
            return None

        return MoreInfo(
            mapping.get('status')
        )

class Payload(object):
    def __init__(self, action, method, data, more_info):
        self.action = action
        self.method = method
        self.data = data
        self.more_info = more_info

    @staticmethod
    def fromJson(mapping):
        if mapping is None:
            return None

        return Payload(
            mapping.get('action'),
            mapping.get('method'),
            mapping.get('data'),
            MoreInfo.fromJson(mapping.get('more_info'))
        )

import json
def toJson(obj, **kwargs):
    return json.dumps(obj, default=lambda j: j.__dict__, **kwargs)

def fromJson(msg, cls, **kwargs):
    return cls.fromJson(json.loads(msg, **kwargs))

info = MoreInfo('ok')
payload = Payload('print', 'onData', 'better_solution', info)
pl_json = toJson(payload)
l1 = fromJson(pl_json, Payload)

Answered By: 6harat

There are different methods to deserialize json string to an object. All above methods are acceptable but I suggest using a library to prevent duplicate key issues or serializing/deserializing of nested objects.

Pykson, is a JSON Serializer and Deserializer for Python which can help you achieve. Simply define Payload class model as JsonObject then use Pykson to convert json string to object.

from pykson import Pykson, JsonObject, StringField

class Payload(pykson.JsonObject):
    action = StringField()
    method = StringField()
    data = StringField()

json_text = '{"action":"print","method":"onData","data":"Madan Mohan"}'
payload = Pykson.from_json(json_text, Payload)
Answered By: Sina Rezaei

pydantic is an increasingly popular library for python 3.6+ projects. It mainly does data validation and settings management using type hints.

A basic example using different types:

from pydantic import BaseModel

class ClassicBar(BaseModel):
    count_drinks: int
    is_open: bool
 
data = {'count_drinks': '226', 'is_open': 'False'}
cb = ClassicBar(**data)
>>> cb
ClassicBar(count_drinks=226, is_open=False)

What I love about the lib is that you get a lot of goodies for free, like

>>> cb.json()
'{"count_drinks": 226, "is_open": false}'
>>> cb.dict()
{'count_drinks': 226, 'is_open': False}
Answered By: LarsVegas
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.