How to unwrap a JSON column and split the attributes into separate columns with live streaming data

Question:

DATASET

ERROR

df = pd.DataFrame.from_dict(df)
split = df['device_data'].apply(lambda x: pd.json_normalize(json.loads(x)))

a = df.drop(columns=['device_data'])
b = pd.concat(list(split), ignore_index=True)
df = a.join(b)

This is the code I am attempting to run. It works when I run it on a CSV file, but I am continuously streaming data from DynamoDB, and with that data the code does not work. I receive the following error when attempting to do so.

FULL CODE

Asked By: Christopher McHugh

Answers:

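If the column already contains dicts rather than JSON strings, json.loads is not needed (it raises a TypeError when given a dict), and the data is easier to work with: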

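import pandas as pd

# Sample rows that are already dicts, not JSON strings
a = [{'temperature': 20, 'humidity': 30, 'pressure': 180}, {'temperature': 30, 'humidity': 90, 'pressure': 120}]

data = pd.DataFrame()
data["new"] = a

# Expand each dict into its own set of columns
split = data["new"].apply(pd.Series)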

   temperature  humidity  pressure
0           20        30       180
1           30        90       120

Therefore, I think your code should be:

df = pd.DataFrame.from_dict(df)
split = df['device_data'].apply(pd.Series)

df = pd.concat([df.drop(columns=['device_data']), split], axis=1)
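
If the same code has to handle both the CSV case (JSON strings) and the streamed case (dicts), one option is to parse only the values that are still strings and then flatten everything in a single pd.json_normalize call. This is only a sketch: the helper name expand_device_data, the device_id field, and the sample records are made up for illustration, and the real shape of the streamed items may differ.

```python
import json

import pandas as pd


def expand_device_data(records, column="device_data"):
    """Expand a column of dicts (or JSON strings) into separate columns.

    Hypothetical helper: assumes `records` is a list of dicts and that
    `column` holds either a dict or a JSON-encoded string in every row.
    """
    df = pd.DataFrame(records)
    # Parse only the values that are still JSON strings; leave dicts as-is.
    parsed = [json.loads(v) if isinstance(v, str) else v for v in df[column]]
    # Flatten the list of dicts into a DataFrame in one call.
    expanded = pd.json_normalize(parsed)
    # Align the index so the column-wise concat matches rows correctly.
    expanded.index = df.index
    return pd.concat([df.drop(columns=[column]), expanded], axis=1)


# Made-up sample: one row already a dict, one row still a JSON string.
records = [
    {"device_id": "a1", "device_data": {"temperature": 20, "humidity": 30, "pressure": 180}},
    {"device_id": "b2", "device_data": '{"temperature": 30, "humidity": 90, "pressure": 120}'},
]
result = expand_device_data(records)
print(result)
```

Calling pd.json_normalize once on the whole list also avoids building one small DataFrame per row, which is what the apply-based versions do.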
Answered By: Ugur Yigit