How to groupby two columns of a dataframe and convert other columns into dict with column header as key

Question:

Dataframe:

id      id_2    salary  title   allowance   name
0420    13.28   100000  director    No      Tom
0420    13.28   70000   developer   Yes     Sam
0110    13.12   120000  director    No      Dave
0110    13.12   75000   developer   Yes     shaun 

Groupby id and id_2 and convert the rest of columns to dict with column header

I wrote a loop for this, I think don’t think it is python way, Please let me know how to do it with pandas.

Required output:

[{
            "id": 420,
            "id_2": 13.28,
            "attributes":[
                    {   "salary": 100000,
                        "title":"director",
                        "allowance":"No",
                        "name": "Tom"
                    },
                    {   "salary": 70000,
                        "title": "developer",
                        "allowance":"Yes",
                        "name": "Sam"
                    }
                ]
            },
            {
            "id": 110,
            "id_2": 13.12,
            "attributes":[
                    {   "salary": 120000,
                        "title":"director",
                        "allowance":"No",
                        "name": "Dave"
                    },
                    {   "salary": 75000,
                        "title": "developer",
                        "allowance":"Yes",
                        "name": "shaun"
                    }
                ]
            }   
]
Asked By: Vamsi Nimmala

||

Answers:

  • There is not a one-liner pandas argument that will provide a list of dicts in the shape you’re requesting.
  • Use .groupby to select the groups
    • g is the tuple that represents the values used to groupby
    • d is the dataframe for the groupby value, g
  • Use .iterrows to iterate through the rows of each group
    • Returns index represented by the first _, because it’s not needed
    • Returns data, from which the labels in groupby_list are dropped, then convert the remainder to a dict using .to_dict(), and append it to the list, att_list
    • After looping through all rows of the group, assign att_list as the value to group['attributes']
  • Once each group has been iterated through, append the dict, group, to dict_list.
  • dict_list can be converted back to a dataframe with the following:
    • df = pd.json_normalize(dict_list, 'attributes', meta=groupby_list)
dict_list = list()
groupby_list = ['id', 'id_2']
for g, d in df.groupby(groupby_list):
    group = dict(zip(groupby_list, g))
    att_list = list()
    for _, data in d.iterrows():
        data = data.drop(labels=groupby_list)
        att_list.append(data.to_dict())
    group['attributes'] = att_list
    dict_list.append(group)

dict_list:

[{'attributes': [{'allowance': 'No',
                  'name': 'Dave',
                  'salary': 120000,
                  'title': 'director'},
                 {'allowance': 'Yes',
                  'name': 'shaun',
                  'salary': 75000,
                  'title': 'developer'}],
  'id': 110,
  'id_2': 13.12},
 {'attributes': [{'allowance': 'No',
                  'name': 'Tom',
                  'salary': 100000,
                  'title': 'director'},
                 {'allowance': 'Yes',
                  'name': 'Sam',
                  'salary': 70000,
                  'title': 'developer'}],
  'id': 420,
  'id_2': 13.28}]
Answered By: Trenton McKinney
df1.groupby(['id','id_2']).apply(lambda dd:dd.iloc[:,2:].to_dict('r'))
    .to_frame("attributes").reset_index().to_dict('r')

out:

[{'id': 110,
  'id_2': 13.12,
  'attributes': [{'salary': 120000,
    'title': 'director',
    'allowance': 'No',
    'name': 'Dave'},
   {'salary': 75000,
    'title': 'developer',
    'allowance': 'Yes',
    'name': 'shaun'}]},
 {'id': 420,
  'id_2': 13.28,
  'attributes': [{'salary': 100000,
    'title': 'director',
    'allowance': 'No',
    'name': 'Tom'},
   {'salary': 70000,
    'title': 'developer',
    'allowance': 'Yes',
    'name': 'Sam'}]}]
Answered By: G.G
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.