pandas df turn rows containing dict lists into new df and keep id

Question:

I hope I can describe the problem clearly.
I have a pandas df with 2 columns (date, person) that looks like this:

    date          person
0   2002-09-04    [{'name':'anna', 'weight':'2.9', 'hospital':'x'}, {'name': 'jacob', ...}, ...] 
1   2002-10-16    [{'name':'lynn', 'weight':'3.0', 'hospital':'y'}, {'name': 'tony', ...}, ...] 
2   ...
3

and want to turn it into this:

    date          name    weight    hospital
0   2002-09-04    anna    2.9       x
1   2002-09-04    jacob   ...       ...
.
.
.
n   2002-10-16    lynn    3.0       y
n1  2002-10-16    tony    ...       ...

Originally there are more columns with important information. Therefore I want to create a new df to make the data in ‘person’ more accessible.
I tried appending the dict lists row by row.

So far I have:

df_person = pd.DataFrame()


for row, _ in enumerate(df['person']):
    df_person = df_person.append(df['person'][row], ignore_index = True, sort = False)

This gives me the df with name, weight and hospital as columns. But how can I now keep the right dates for each entry?

Asked By: Jana

||

Answers:

You can explode your lists of dictionaries so that each dictionary gets its own row, then convert the dictionaries to columns with pandas.json_normalize:

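import pandas as pd

# One dict per row first, then expand each dict into its own columns
out = (df
   .explode('person', ignore_index=True)  # ignore_index gives a fresh 0..n-1 index
   .pipe(lambda d: d.join(pd.json_normalize(d.pop('person'))))  # pop removes 'person' from d and returns it
)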

output:

         date   name weight hospital
0  2002-09-04   anna    2.9        x
1  2002-09-04  jacob    NaN      NaN
2  2002-10-16   lynn    3.0        y
3  2002-10-16   tony    NaN      NaN
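
To try the approach end to end, here is a minimal sketch on made-up sample data. The frame below is an assumption for illustration only (the second person "ben" and all values are hypothetical, not the asker's data), and the chain is split into named steps so each stage can be inspected. It assumes pandas 1.1 or later, where explode accepts ignore_index.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question
df = pd.DataFrame({
    "date": ["2002-09-04", "2002-10-16"],
    "person": [
        [{"name": "anna", "weight": "2.9", "hospital": "x"},
         {"name": "ben", "weight": "3.4", "hospital": "x"}],
        [{"name": "lynn", "weight": "3.0", "hospital": "y"}],
    ],
})

# Step 1: one dict per row; every other column (here 'date') is repeated
exploded = df.explode("person", ignore_index=True)

# Step 2: expand the dicts into columns; the result has a 0..n-1 index
people = pd.json_normalize(exploded["person"].tolist())

# Step 3: drop the original column and align the new columns by index
out = exploded.drop(columns="person").join(people)
print(out)
```

Because the join aligns on the index, ignore_index=True in step 1 matters: without it the exploded rows keep duplicated index labels and would not line up one-to-one with the normalized frame.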
Answered By: mozway