Exploding specific type of json, into columns pandas

Question:

So i have the following sample json, contained into a series of mine

s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])

I dont actually know a way of exploding this type of json, into columns, would appreciate some help? I tried json_normalize loads and so on but i get empty results.

Wanted result would be something like this:

df = pd.DataFrame({'city':['Uberlandia','Uberlandia','Uberlandia'],'bot-origin':[null,null,null]}) # There are more columns but you get the jist.

Since there were a lot of answers would appreciate if someone showed me the most time efficient way, have lots of rows.

Asked By: INGl0R1AM0R1

||

Answers:

The following should work

import json 

s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
       '{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])

df = s.apply(lambda row: pd.Series(json.loads(row)))

Output:

city bot-origin campaign-source lastState main-installation-date userid full-name alternative-installation-date chosen-product bank postalcode due-date cpf origin-link payment state api-orders-hash-id email api-orders-error plan-name userphone plan-offer completed-address type-of-person onboarding-simplified type-of-product main-installation-period-day plan-value alternative-installation-period-day data-change
0 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 – R IGUACU, 1289 – UMUARAMA – null – Uberlândia – MG CPF Residencial false
1 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 – R IGUACU, 1289 – UMUARAMA – null – Uberlândia – MG CPF Residencial false
2 Uberlândia carrinho-abandonado-ecommerce-sms productAvailabilityAddressConfirmation [email protected] Internet 38405328 01548226041 MG 34 9342-8011 38405328 – R IGUACU, 1289 – UMUARAMA – null – Uberlândia – MG CPF Residencial false

The null values are converted to None. If you want to convert them to NaN you can use .fillna(np.nan):

import json
import numpy as np

df = (
    s.apply(lambda row: pd.Series(json.loads(row)))
     .fillna(np.nan)
)
Answered By: Rodalm

Try this:

df = pd.DataFrame.from_records(s.map(json.loads))
print(df)

only printed a part of the whole df for presentation.

         city bot-origin                    campaign-source                                  lastState main-installation-date
0  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue                   None
1  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue                   None
2  Uberlândia       None  carrinho-abandonado-ecommerce-sms     productAvailabilityAddressConfirmation                   None
Answered By: Rabinzel

Looks like you have string formatted json objects, first we need to convert them then pass them into a json_normalize method.

import json 

df1 = pd.json_normalize(s.map(json.loads))


         city bot-origin                    campaign-source                                  lastState  ... main-installation-period-day plan-value alternative-installation-period-day data-change
0  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue  ...                         None       None                                None       false
1  Uberlândia       None  carrinho-abandonado-ecommerce-sms  productAvailabilityCepInDatabaseEqualTrue  ...                         None       None                                None       false
2  Uberlândia       None  carrinho-abandonado-ecommerce-sms     productAvailabilityAddressConfirmation  ...                         None       None                                None       false
Answered By: Umar.H
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.