How do I convert a json file to a python class?

Question:

Consider this JSON file named h.json. I want to convert it into a Python dataclass.

{
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2
    }

}

I could use an alternative constructor for getting each account, for example:

import json
from dataclasses import dataclass

@dataclass
class Account(object):
    email:str
    password:str
    name:str
    salary:int
    
    @classmethod
    def from_json(cls, json_key):
        with open("h.json") as fh:
            file = json.load(fh)
        return cls(**file[json_key])

However, this is limited to the fields (email, name, etc.) that are defined in the dataclass.

What if I were to modify the JSON to include another key, say age?
The script would then raise a TypeError, specifically TypeError: __init__() got an unexpected keyword argument 'age'.

Is there a way to dynamically adjust the class attributes based on the keys of the dict (json object), so that I don’t have to add attributes each time I add a new key to the json?

Asked By: Kanishk


Answers:

For a flat (non-nested) dataclass, the code below does the job.
If you need to handle nested dataclasses, you should use a library like dacite.

Note 1: loading the data from the JSON file should not be part of your class logic.

Note 2: if your JSON can contain arbitrary keys, you cannot map it to a dataclass and will have to work with a dict.

from dataclasses import dataclass
from typing import List

data = {
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2
    }

}



@dataclass
class Account:
    email:str
    password:str
    name:str
    salary:int

accounts: List[Account] = [Account(**x) for x in data.values()]
print(accounts)

Output:

[Account(email='[email protected]', password='acc1', name='ACC1', salary=1), Account(email='[email protected]', password='acc2', name='ACC2', salary=2)]
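
Note 1 above can be sketched as follows. This is only an illustration: load_accounts is a hypothetical helper name, the sample values are made up, and an in-memory StringIO stands in for the real file so the snippet runs on its own.

```python
import io
import json
from dataclasses import dataclass
from typing import Dict, TextIO


@dataclass
class Account:
    email: str
    password: str
    name: str
    salary: int


def load_accounts(fp: TextIO) -> Dict[str, Account]:
    """Hypothetical helper: parse the JSON outside the class and build one Account per key."""
    raw = json.load(fp)
    return {key: Account(**value) for key, value in raw.items()}


# Stand-in for open("h.json"); the values are made up for the example
sample = io.StringIO(
    '{"acc1": {"email": "a@example.com", "password": "acc1", "name": "ACC1", "salary": 1}}'
)
accounts = load_accounts(sample)
print(accounts["acc1"].name)  # ACC1
```

Keeping the file handling in a separate function leaves Account as a plain data holder, and the loader can be given any file-like object.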
Answered By: balderman

With the approach below, you lose some dataclass features for the keys that are not declared as fields:

  • the annotation that says whether a value is optional or not
  • auto-completion in your editor

However, you know your project best, so decide accordingly.

There are probably many ways to do this; here is one of them:

import json
from dataclasses import dataclass, fields


@dataclass
class Account(object):
    email: str
    password: str
    name: str
    salary: int

    @classmethod
    def from_json(cls, json_key):
        with open("1.txt") as fh:
            file = json.load(fh)
        keys = [f.name for f in fields(cls)]
        # or: keys = cls.__dataclass_fields__.keys()
        json_data = file[json_key]
        # Keys declared on the dataclass go through the generated __init__
        normal_json_data = {key: json_data[key] for key in json_data if key in keys}
        # Any other keys are attached afterwards as plain instance attributes
        anormal_json_data = {key: json_data[key] for key in json_data if key not in keys}
        tmp = cls(**normal_json_data)
        for anormal_key in anormal_json_data:
            setattr(tmp, anormal_key, anormal_json_data[anormal_key])
        return tmp


# Assumes the "acc1" entry in the file has an "age" key
test = Account.from_json("acc1")
print(test.age)
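
To illustrate the trade-off: attributes attached with setattr are not dataclass fields, so the generated __repr__ and __eq__ ignore them. The sketch below is a minimal, self-contained variant that takes a dict instead of reading a file; from_dict is a hypothetical name and the sample values are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Account:
    email: str
    password: str
    name: str
    salary: int

    @classmethod
    def from_dict(cls, raw):
        # Hypothetical helper: declared keys go to __init__, the rest via setattr
        known = {f.name for f in fields(cls)}
        obj = cls(**{k: v for k, v in raw.items() if k in known})
        for k, v in raw.items():
            if k not in known:
                setattr(obj, k, v)
        return obj


base = {"email": "a@example.com", "password": "p", "name": "A", "salary": 1}
a = Account.from_dict({**base, "age": 30})
b = Account.from_dict({**base, "age": 40})

print(a.age)             # 30: the extra attribute is reachable
print(a == b)            # True: __eq__ compares declared fields only
print("age" in repr(a))  # False: __repr__ shows declared fields only
```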
Answered By: PersianMan

Since it sounds like your data is expected to be dynamic and you want the freedom to add more fields to the JSON object without reflecting the same changes in the model, I’d also suggest checking out typing.TypedDict instead of a dataclass.

Here’s an example with TypedDict, which should work in Python 3.7+. Since TypedDict was only added to the typing module in 3.8, I’ve imported it from typing_extensions so that the code stays compatible with 3.7.

from __future__ import annotations

import json
from io import StringIO
from typing_extensions import TypedDict


class Account(TypedDict):
    email: str
    password: str
    name: str
    salary: int


json_data = StringIO("""{
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2,
        "someRandomKey": "string"
    }
}
""")

data = json.load(json_data)
name_to_account: dict[str, Account] = data

acct = name_to_account['acc2']

# Your IDE should be able to offer auto-complete suggestions within the
# brackets, when you start typing or press 'Ctrl + Space' for example.
print(acct['someRandomKey'])
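
One caveat worth illustrating: a TypedDict is only a hint for static tools. At runtime the value is an ordinary dict, so nothing is validated, and a static checker may flag a literal lookup of a key that the class does not declare. The sketch below assumes Python 3.8+ (for typing.TypedDict) and uses made-up sample values.

```python
import json
from typing import TypedDict


class Account(TypedDict):
    email: str
    password: str
    name: str
    salary: int


# Made-up sample: "salary" has the wrong type and there is an undeclared key
raw = json.loads(
    '{"email": "a@example.com", "password": "p", "name": "A",'
    ' "salary": "lots", "someRandomKey": "string"}'
)
acct: Account = raw

print(type(acct) is dict)        # True: no wrapper class exists at runtime
print(acct["salary"])            # lots: the wrong type is not rejected
print(raw.get("someRandomKey"))  # string: extra keys are kept as-is
```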

If you are set on using dataclasses to model your data, I’d suggest checking out a JSON serialization library like dataclass-wizard (disclaimer: I am the creator). It should handle extraneous fields in the JSON data as mentioned, as well as a nested dataclass model if your data becomes more complex.

It also has a handy tool for generating a dataclass schema from JSON data, which can be useful, for instance, if you want to update your model class whenever you add new fields to the JSON file.

Answered By: rv.kvetch