Python Create dataframe from nested dict with lists

Question:

I am trying to create a DataFrame / CSV that looks like this:

App      id   stages   requestCpu  requestMemory
appName  123  dev      1000        1024
appName  123  staging  3200        1024

The dict looks like this. It contains quite a lot of apps, but the data inside every app follows the same layout:

test_data = {"appName": {"id": "123", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}}, "appName2"...}

I used something like this before:

df = pd.DataFrame.from_dict(test_data, orient='index')
df = pd.concat([df.drop(['stages'], axis=1), (df['stages'].apply(pd.Series))], axis=1)
df.index.name = "App"

However, this did not split up the list part, and the stages ended up as columns, which is not the layout I wanted.

Any help much appreciated, thanks

Asked By: N1ckson

Answers:

The easiest solution is to iterate over the dict and build the rows yourself before loading them into pandas:

import pandas as pd

test_data = {"appName": {"id": "123", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}}, "appName2": {"id": "456", "stages": {"dev": [{"request.cpu": 1000}, {"request.memory": 1024}], "staging": [{"request.cpu": 3200}, {"request.memory": 1024}]}}}


rows = []

# Build one flat row per (app, stage) pair
for app, app_data in test_data.items():
    for stage, stage_data in app_data["stages"].items():
        row = {
            "App": app,
            "id": app_data["id"],
            "stages": stage
        }
        # Each list element is a single-key dict, e.g. {"request.cpu": 1000}
        for metric in stage_data:
            metric_name, metric_value = list(metric.items())[0]
            row[metric_name] = metric_value
        rows.append(row)

df = pd.json_normalize(rows)

# Reorder columns
df = df[["App", "id", "stages", "request.cpu", "request.memory"]]

Output:

        App   id   stages  request.cpu  request.memory
0   appName  123      dev         1000            1024
1   appName  123  staging         3200            1024
2  appName2  456      dev         1000            1024
3  appName2  456  staging         3200            1024
Answered By: RJ Adriaansen
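
Since the question also asks for a CSV with the headers requestCpu and requestMemory, here is a minimal sketch of how the same row-building idea could be extended. It is not part of the original answer. The helper name flatten_apps and the rename mapping are assumptions made for illustration, and the sample data only includes the one app shown in full in the question.

```python
import io

import pandas as pd

# Sample input in the same shape as the question's test_data (one app only).
test_data = {
    "appName": {
        "id": "123",
        "stages": {
            "dev": [{"request.cpu": 1000}, {"request.memory": 1024}],
            "staging": [{"request.cpu": 3200}, {"request.memory": 1024}],
        },
    },
}


def flatten_apps(data):
    """Hypothetical helper: return one flat dict per (app, stage) pair."""
    rows = []
    for app, app_data in data.items():
        for stage, metrics in app_data["stages"].items():
            row = {"App": app, "id": app_data["id"], "stages": stage}
            # Each list element is a single-key dict; merge them into the row.
            for metric in metrics:
                row.update(metric)
            rows.append(row)
    return rows


df = pd.DataFrame(flatten_apps(test_data))

# Assumed renaming, chosen to match the headers requested in the question.
df = df.rename(columns={"request.cpu": "requestCpu", "request.memory": "requestMemory"})

# Write the CSV to an in-memory buffer; pass a file path instead to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing index=False keeps the numeric row index out of the file, so the CSV only contains the five requested columns.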