Converting dictionary of lists of dictionaries to a dataframe

Question

Say I have a dict defined as:

dict = {'1': [{'name': 'Hospital 0',
               'students': 5,
               'grad': 71},

              {'name': 'Hospital 1',
               'students': 8,
               'grad': 74}],

        '2': [{'name': 'Hospital 0',
               'students': 11,
               'grad': 72},

              {'name': 'Hospital 1',
               'students': 10,
               'grad': 78}]}

Suppose I want to make a dataframe from this formatted as follows:

step	name	students	grad
1	Hospital 0	5	71
1	Hospital 1	8	74
2	Hospital 0	11	72
2	Hospital 1	10	78

Do you guys have any ideas?

Asked By: PurpleSky

Answer 1

Try pandas.DataFrame: loop over the dictionary, build one row per inner record, and pass the list of rows to the constructor.
The headers are [step, name, students, grad].

import pandas as pd

data = []

for key, value in dict.items():
    for elem in value:
        row = {
            'step': key,
            'name': elem['name'],
            'students': elem['students'],
            'grad': elem['grad']
        }
        data.append(row)

df = pd.DataFrame(data)
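
As a sketch, the same loop can be condensed into a single comprehension. Every inner record already carries the keys name, students and grad, so unpacking it with ** and adding the outer key as step gives one row per record. The names data (used instead of dict, to avoid shadowing the built-in) and rows are my own choices, not part of the question.

```python
import pandas as pd

# Same structure as in the question, stored under `data` rather than `dict`.
data = {
    "1": [{"name": "Hospital 0", "students": 5, "grad": 71},
          {"name": "Hospital 1", "students": 8, "grad": 74}],
    "2": [{"name": "Hospital 0", "students": 11, "grad": 72},
          {"name": "Hospital 1", "students": 10, "grad": 78}],
}

# One row per inner record; the outer key becomes the "step" column.
rows = [
    {"step": step, **record}
    for step, records in data.items()
    for record in records
]

df = pd.DataFrame(rows)
print(df)
```

Because step is inserted first in each row, the columns come out in the order step, name, students, grad. The step values stay strings here, since the dictionary keys are strings.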

Answered By: Hope

Answer 2

Here is an approach using json_normalize().
Note: I am using data as the variable name instead of dict, because dict shadows Python's built-in dict type.

from pandas import json_normalize
import pandas as pd 

dfs = [json_normalize(data[key]).assign(step=key) for key in data if "name" in data[key][0]]
df = pd.concat(dfs, ignore_index=True)
df = df[["step", "name", "students", "grad"]]
print(df)

  step        name  students  grad
0    1  Hospital 0         5    71
1    1  Hospital 1         8    74
2    2  Hospital 0        11    72
3    2  Hospital 1        10    78
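
Since the records here are flat (no nested fields), a possible variant skips json_normalize and lets pd.concat attach the step label through a dictionary of frames. This is only a sketch of an alternative technique, not the method above. The names frames and data are my own, and data is assumed to hold the dictionary from the question.

```python
import pandas as pd

# Assumed to be the dictionary from the question.
data = {
    "1": [{"name": "Hospital 0", "students": 5, "grad": 71},
          {"name": "Hospital 1", "students": 8, "grad": 74}],
    "2": [{"name": "Hospital 0", "students": 11, "grad": 72},
          {"name": "Hospital 1", "students": 10, "grad": 78}],
}

# One small DataFrame per step, keyed by the step label.
frames = {step: pd.DataFrame(records) for step, records in data.items()}

# Concatenating a dict uses its keys as the outer index level, named "step".
# The first reset_index moves that level into a column; the second drops
# the leftover per-frame row numbers.
df = (
    pd.concat(frames, names=["step"])
    .reset_index(level="step")
    .reset_index(drop=True)
)
print(df)
```

Unlike the list comprehension above, this version has no filter on "name", so it assumes every list is non-empty and contains well-formed records.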

Answered By: Jamiu S.

Answer 3

Here is some documentation on Pandas DataFrames:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

You can also get documentation from the Python shell:

import pandas as pd
help(pd.DataFrame)

The documentation gives this example:

 |  Examples
 |  --------
 |  Constructing DataFrame from a dictionary.
 |  
 |  >>> d = {'col1': [1, 2], 'col2': [3, 4]}
 |  >>> df = pd.DataFrame(data=d)
 |  >>> df
 |     col1  col2
 |  0     1     3
 |  1     2     4

We can format your data in a slightly different way to make it easier.

% python
>>> import pandas as pd
>>> d = {}
>>> d['step'] = [1, 1, 2, 2]
>>> d['name'] = ['Hospital 0', 'Hospital 1', 'Hospital 0', 'Hospital 1']
>>> d['students'] = [5, 8, 11, 10]
>>> d['grad'] = [71, 74, 72, 78]
>>> df = pd.DataFrame(d)
>>> print(df.to_string(index=False))
 step        name  students  grad
    1  Hospital 0         5    71
    1  Hospital 1         8    74
    2  Hospital 0        11    72
    2  Hospital 1        10    78

One solution is to structure the dictionary so that it meets the requirements of the DataFrame constructor. The code above is based on the example from the Pandas documentation.
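
If the data already exists in the nested form from the question, the column-oriented dictionary does not have to be typed by hand. Here is a minimal sketch of that restructuring. The name data for the original dictionary is my own, and converting the keys with int() assumes that every step key is a numeric string.

```python
import pandas as pd

# Assumed to be the nested dictionary from the question.
data = {
    "1": [{"name": "Hospital 0", "students": 5, "grad": 71},
          {"name": "Hospital 1", "students": 8, "grad": 74}],
    "2": [{"name": "Hospital 0", "students": 11, "grad": 72},
          {"name": "Hospital 1", "students": 10, "grad": 78}],
}

# Column-oriented dictionary: one list per column, as the constructor expects.
d = {"step": [], "name": [], "students": [], "grad": []}

for step, records in data.items():
    for record in records:
        d["step"].append(int(step))  # assumption: keys are numeric strings
        d["name"].append(record["name"])
        d["students"].append(record["students"])
        d["grad"].append(record["grad"])

df = pd.DataFrame(d)
print(df.to_string(index=False))
```

This fills d with the same lists as the hand-written version above, so the resulting DataFrame has the same rows and columns.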

Answered By: ktm5124

Answer 4

Using the pandas library seems like the best option for your issue. I hope the code below is helpful.

import pandas as pd
df = pd.DataFrame(columns=['step', 'name', 'students', 'grad'])
# dicta is assumed to hold the dictionary from the question
keys_values = list(dicta.keys())
ind = 0
for key in keys_values:
    rows = dicta[key]
    for row in rows:
        df.loc[ind] = [key, row['name'], row['students'], row['grad']]
        ind += 1
print(df)

Answered By: Wend Yam D. Davy Ouedraogo
