How to unwrap a JSON column and split the attributes into separate columns with live streaming data
Question:
df = pd.DataFrame.from_dict(df)
split = df['device_data'].apply(lambda x: pd.json_normalize(json.loads(x)))
a = df.drop(columns=['device_data'])
b = pd.concat(list(split), ignore_index=True)
df = a.join(b)
This is the code I am attempting to run. It works when I load the data from a CSV file, but I am continuously streaming data from DynamoDB, and with the streamed data this code does not work and raises an error.
Answers:
If the column already contains dicts rather than JSON strings, it should be easier to work with:
import pandas as pd

a = [{'temperature': 20, 'humidity': 30, 'pressure': 180}, {'temperature': 30, 'humidity': 90, 'pressure': 120}]
data = pd.DataFrame()
data["new"] = a
# Expand each dict into its own set of columns
split = data["new"].apply(pd.Series)
   temperature  humidity  pressure
0           20        30       180
1           30        90       120
Therefore, I think your code should be:
df = pd.DataFrame.from_dict(df)
split = df['device_data'].apply(pd.Series)
df = pd.concat([df.drop(columns=['device_data']), split], axis=1)
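If you are not sure whether a given batch holds JSON strings (as in the CSV case) or dicts (as seems to be the case with the streamed data), here is a minimal sketch that accepts both. The helper name expand_device_data, the device_id field and the sample records are made up for illustration; only the device_data column name comes from the question. It assumes every value in the column is either a dict or a valid JSON string.

```python
import json

import pandas as pd


def expand_device_data(records, column="device_data"):
    """Expand a column of dicts (or JSON strings) into separate columns.

    Hypothetical helper: assumes each value is a dict or a valid JSON string.
    """
    df = pd.DataFrame(records)
    # json.loads raises TypeError on a dict, so only parse actual strings.
    parsed = [json.loads(v) if isinstance(v, str) else v for v in df[column]]
    # json_normalize takes a list of dicts and returns one row per dict.
    expanded = pd.json_normalize(parsed)
    # Align on the original index so concat lines the rows up correctly.
    expanded.index = df.index
    return pd.concat([df.drop(columns=[column]), expanded], axis=1)


# Made-up sample: one record holds a dict, the other a JSON string.
records = [
    {"device_id": "a1", "device_data": {"temperature": 20, "humidity": 30, "pressure": 180}},
    {"device_id": "b2", "device_data": '{"temperature": 30, "humidity": 90, "pressure": 120}'},
]
result = expand_device_data(records)
print(result)
```

Calling json_normalize once on the whole list also avoids building a separate one-row DataFrame per record, which is what the apply with json_normalize in the question does.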