How to convert JSON data inside a pandas column into new columns

Question:

I have this short version of ADSB json data and would like to convert it into dataFrame columns as Icao, Alt, Lat, Long, Spd, Cou…..

After Alperen told me to do this

df = pd.read_json('2016-06-20-2359Z.json', lines=True), 

I can load it into a DataFrame. However, df.acList is

[{'Id': 10537990, 'Rcvr': 1, 'HasSig': False, ...
Name: acList, dtype: object

How can I get the Icao, Alt, Lat, Long, Spd, Cou data?

 "src":1,
   "feeds":[  
      {  
         "id":1,
         "name":"ADSBexchange.com",
         "polarPlot":false
      }
   ],
   "srcFeed":1,
   "showSil":true,
   "showFlg":true,
   "showPic":true,
   "flgH":20,
   "flgW":85,
   "acList":[  
      {  
         "Id":11281748,
         "Rcvr":1,
         "HasSig":false,
         "Icao":"AC2554",
         "Bad":false,
         "Reg":"N882AS",
         "FSeen":"/Date(1466467166951)/",
         "TSecs":3,
         "CMsgs":1,
         "AltT":0,
         "Tisb":false,
         "TrkH":false,
         "Type":"CRJ2",
         "Mdl":"2001 
BOMBARDIER INC 
 CL-600-2B19",
         "Man":"Bombardier",
         "CNum":"7503",
         "Op":"EXPRESSJET AIRLINES INC - ATLANTA, GA",
         "OpIcao":"ASQ",
         "Sqk":"",
         "VsiT":0,
         "WTC":2,
         "Species":1,
         "Engines":"2",
         "EngType":3,
         "EngMount":1,
         "Mil":false,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":false,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1,
         "Year":"2001"
      },
      {  
         "Id":11402205,
         "Rcvr":1,
         "HasSig":true,
         "Sig":110,
         "Icao":"ADFBDD",
         "Bad":false,
         "FSeen":"/Date(1466391940977)/",
         "TSecs":75229,
         "CMsgs":35445,
         "Alt":8025,
         "GAlt":8025,
         "AltT":0,
         "Call":"TEST1234",
         "Tisb":false,
         "TrkH":false,
         "Sqk":"0262",
         "Help":false,
         "VsiT":0,
         "WTC":0,
         "Species":0,
         "EngType":0,
         "EngMount":0,
         "Mil":true,
         "Cou":"United States",
         "HasPic":false,
         "Interested":false,
         "FlightsCount":0,
         "Gnd":true,
         "SpdTyp":0,
         "CallSus":false,
         "TT":"a",
         "Trt":1
      }
   ],
   "totalAc":4231,
   "lastDv":"636019887431643594",
   "shtTrlSec":61,
   "stm":1466467170029
}
</pre>
Asked By: user8149657

||

Answers:

If you already have your data in acList column in a pandas DataFrame, simply do:

import pandas as pd
pd.io.json.json_normalize(df.acList[0])

Alt AltT    Bad CMsgs   CNum    Call    CallSus Cou EngMount    EngType ... Sqk TSecs   TT  Tisb    TrkH    Trt Type    VsiT    WTC Year
0   NaN 0   False   1   7503    NaN False   United States   1   3   ...     3   a   False   False   1   CRJ2    0   2   2001
1   8025.0  0   False   35445   NaN TEST1234    False   United States   0   0   ... 0262    75229   a   False   False   1   NaN 0   0   NaN

Since pandas 1.0 the imports should be:

import pandas as pd
pd.json_normalize(df.acList[0])
Answered By: Sergey Bushmanov

@Sergey’s answer solved the issue for me but I was running into issues because the json in my data frame column was kept as a string and not as an object. I had to add the additional step of mapping the column:

import json
import pandas as pd
pd.io.json.json_normalize(df.acList.apply(json.loads))
Answered By: ThinkBonobo

I can’t comment yet on ThinkBonobo’s answer but in case the JSON in the column isn’t exactly a dictionary you can keep doing .apply until it is. So in my case

import json
import pandas as pd

json_normalize(
    df
    .theColumnWithJson
    .apply(json.loads)
    .apply(lambda x: x[0]) # the inner JSON is list with the dictionary as the only item
)
Answered By: blockw

Since pandas 1.0, json_normalize is available in the top-level namespace.
Therefore use:

import pandas as pd
pd.json_normalize(df.acList[0])
Answered By: Jonne Kleijer

In my case I had some missing values (None) then I created a more specific code that also drops the original column after creating the new ones:

for prefix in ['column1', 'column2']:
    df_temp = df[prefix].apply(lambda x: {} if pd.isna(x) else x)
    df_temp = pd.io.json.json_normalize(df_temp)
    df_temp = df_temp.add_prefix(prefix + '_')
    df.drop([prefix], axis=1, inplace=True)
    df = pd.concat([df, df_temp], axis = 1, sort=False)
Answered By: Felipe Augusto
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.