Python – How can I move a nested json dictionary up to its own index?
Question:
I have a json dataset where each item/index can contain 2 nested dictionaries. The problem is that one of these nested dictionaries contains all of the exact key:value pairs as its parent dictionary. To put it in other words, I have a parent "Account" and any time there are "Sub-Accounts" it places the Sub-Accounts in the nested dictionary, and they are never seen as their own standalone item/index.
Here is the sample json of one item/index. Essentially, I need the sub_accounts object to extracted and become its own index. As you can see, it contains all of the same key:value objects as the parent containins the sub_accounts.
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
},
I have been able to use json_normalize or even variations of .pop() to accomplish a flattening of data and I have tried to explore other flattening options, but with no luck on the specific task I am trying to accomplish. Those solutions usually just result with the subaccounts still be associated to the original index.
Answers:
You could use a recursive function to traverse the hierarchy while progressively popping out the "sub_accounts" keys:
def extractAccounts(accounts):
return [s for a in accounts
for s in (a,*extractAccounts(a.pop("sub_accounts",[])))]
From a list of account objects:
data = [{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
}]
Output:
accounts = extractAccounts(data)
for i,account in enumerate(accounts):
print("Account #",i)
print(account)
Account # 0
{'classification': [{'classificationId': 'Cash', 'taxonomyId': 'accounting.gp'}], 'id': '235', 'kind': 'Real', 'name': 'Checking'}
Account # 1
{'classification': [{'classificationId': 'Cash', 'taxonomyId': 'accounting.gp'}], 'id': '236', 'kind': 'Real', 'name': 'Cash Reserve'}
If your top level is a single account (i.e. not a list), just place it in a list when calling the function: extractAccount([data])
I don’t have a generic answer, but this seems to do what you need:
raw_data = """
[
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
}
]
"""
import json
jdict = json.loads(raw_data)
empty_list = list()
result = list()
for elem in jdict:
sub_elem_list = elem['sub_accounts']
elem['sub_accounts'] = empty_list
result.append(elem)
for sub_elem in sub_elem_list:
result.append(sub_elem)
print(json.dumps(result, indent=4))
output = """
[
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": []
},
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
"""
When you have nested structures you need to nest your loops. The other answer has recursion, which can cause problems if you’re nesting over a thousand recursive calls (so probably not this case). I also assumed that you care about order, preferring the parent’s id to be first. Also, if you’re trying to get rid of the sub_accounts
from the json, then you’d want to pop it from the records, but I again assume that the structure should be maintained.
I have a json dataset where each item/index can contain 2 nested dictionaries. The problem is that one of these nested dictionaries contains all of the exact key:value pairs as its parent dictionary. To put it in other words, I have a parent "Account" and any time there are "Sub-Accounts" it places the Sub-Accounts in the nested dictionary, and they are never seen as their own standalone item/index.
Here is the sample json of one item/index. Essentially, I need the sub_accounts object to extracted and become its own index. As you can see, it contains all of the same key:value objects as the parent containins the sub_accounts.
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
},
I have been able to use json_normalize or even variations of .pop() to accomplish a flattening of data and I have tried to explore other flattening options, but with no luck on the specific task I am trying to accomplish. Those solutions usually just result with the subaccounts still be associated to the original index.
You could use a recursive function to traverse the hierarchy while progressively popping out the "sub_accounts" keys:
def extractAccounts(accounts):
return [s for a in accounts
for s in (a,*extractAccounts(a.pop("sub_accounts",[])))]
From a list of account objects:
data = [{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
}]
Output:
accounts = extractAccounts(data)
for i,account in enumerate(accounts):
print("Account #",i)
print(account)
Account # 0
{'classification': [{'classificationId': 'Cash', 'taxonomyId': 'accounting.gp'}], 'id': '235', 'kind': 'Real', 'name': 'Checking'}
Account # 1
{'classification': [{'classificationId': 'Cash', 'taxonomyId': 'accounting.gp'}], 'id': '236', 'kind': 'Real', 'name': 'Cash Reserve'}
If your top level is a single account (i.e. not a list), just place it in a list when calling the function: extractAccount([data])
I don’t have a generic answer, but this seems to do what you need:
raw_data = """
[
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": [
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
}
]
"""
import json
jdict = json.loads(raw_data)
empty_list = list()
result = list()
for elem in jdict:
sub_elem_list = elem['sub_accounts']
elem['sub_accounts'] = empty_list
result.append(elem)
for sub_elem in sub_elem_list:
result.append(sub_elem)
print(json.dumps(result, indent=4))
output = """
[
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "235",
"kind": "Real",
"name": "Checking",
"sub_accounts": []
},
{
"classification": [
{
"classificationId": "Cash",
"taxonomyId": "accounting.gp"
}
],
"id": "236",
"kind": "Real",
"name": "Cash Reserve",
"sub_accounts": []
}
]
"""
When you have nested structures you need to nest your loops. The other answer has recursion, which can cause problems if you’re nesting over a thousand recursive calls (so probably not this case). I also assumed that you care about order, preferring the parent’s id to be first. Also, if you’re trying to get rid of the sub_accounts
from the json, then you’d want to pop it from the records, but I again assume that the structure should be maintained.