Create a dataframe from multiple JSON files with unique keys

Question:

I have a JSON that looks something like this:

translation_map:    
    str_empty:  
        nl: {}
        bn: {}
    str_6df066da34e6:   
        nl: 
            value:  "value 1"
            publishedAt:    16438
            audio:  "value1474.mp3"
        bn: 
            value:  "value2"
            publishedAt:    164322907
    str_9036dfe313457:  
        nl: 
            value:  "value3"
            publishedAt:    1647611912
            audio:  "value3615.mp3"
        bn: 
            value:  "value4"
            publishedAt:    1238641456

I am trying to take some of the fields and put them into a dataframe that I can later export to a CSV. However, I am having trouble with the unique keys.
I have this code, which works for one unique key:

import os, json
import pandas as pd

# json files
path_to_json = r'C:\Users\bob\Videos\Captures'  # raw string, so \U is not read as a unicode escape
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
print(json_files)

# define my pandas Dataframe columns
jsons_data = pd.DataFrame(columns=['transcription', 'meaning', 'sound'])

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)

        transcription = json_text['translation_map']['str_6df066da34e6']['nl']['value']
        sound = json_text['translation_map']['str_6df066da34e6']['nl']['audio']
        meaning = json_text['translation_map']['str_6df066da34e6']['bn']['value']

        jsons_data.loc[index] = [transcription, meaning, sound]

# look at json data in our DataFrame
print(jsons_data)

However, I am not sure how to loop over all of the unique keys with this.

Asked By: sourlemonaid


Answers:

Use a nested loop and dict.values(), so you never need to know the unique keys in advance:

json_text = {
    "translation_map": {
        "str_9asihdu7dcb": {
            "nl": {
                "value": "value2",
                "audio": "8007.mp3"
            },
            "bn": {
                "value": "value4"
            }
        },
        "str_f4c8ashuh524": {
            "nl": {
                "value": "value1",
                "audio": "8026.mp3"
            },
            "bn": {
                "value": "Maet."
            }
        },
        "str_39asjashfk6": {
            "nl": {
                "value": "value5",
                "audio": "40.mp3"
            },
            "bn": {
                "value": "value4"
            }
        },
        "str_empty": {
            "nl": {},
            "bn": {}
        }
    }
}

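# Outer loop: top-level keys of the file (here only "translation_map").
for translation_map in json_text:
    # Inner loop: the entries stored under each unique "str_..." key.
    for v in json_text[translation_map].values():
        if v["nl"]:
            transcription = v["nl"]["value"]
            # .get() avoids a KeyError if an entry has no audio file
            sound = v["nl"].get("audio", "empty")
        else:
            # an empty dict is falsy, so entries like {} land here
            transcription = "empty"
            sound = "empty"

        if v["bn"]:
            meaning = v["bn"]["value"]
        else:
            meaning = "empty"

        print(transcription, sound, meaning)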

Output

value2 8007.mp3 value4
value1 8026.mp3 Maet.
value5 40.mp3 value4
empty empty empty
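
To get from printed values to the dataframe the question asks for, one option is to collect one row per unique key and build the frame once at the end, rather than assigning with .loc inside the loop. The following is only a sketch under a few assumptions: extract_rows and frame_from_folder are hypothetical helper names (not part of pandas), the column names are taken from the question, the "empty" placeholder follows the loop above, and every file is assumed to have a top-level "translation_map" object.

```python
import json
from pathlib import Path

import pandas as pd

COLUMNS = ["transcription", "meaning", "sound"]


def extract_rows(json_text):
    """Return one row (a dict) per unique key under 'translation_map'."""
    rows = []
    for entry in json_text.get("translation_map", {}).values():
        # "or {}" also covers a language block that is missing entirely
        nl = entry.get("nl") or {}
        bn = entry.get("bn") or {}
        rows.append({
            "transcription": nl.get("value", "empty"),
            "meaning": bn.get("value", "empty"),
            "sound": nl.get("audio", "empty"),
        })
    return rows


def frame_from_folder(path_to_json):
    """Collect the rows of every .json file in a folder into one DataFrame."""
    rows = []
    for json_path in sorted(Path(path_to_json).glob("*.json")):
        with open(json_path, encoding="utf-8") as json_file:
            rows.extend(extract_rows(json.load(json_file)))
    return pd.DataFrame(rows, columns=COLUMNS)


# Small in-memory sample shaped like the data in the question.
sample = {
    "translation_map": {
        "str_empty": {"nl": {}, "bn": {}},
        "str_6df066da34e6": {
            "nl": {"value": "value 1", "publishedAt": 16438, "audio": "value1474.mp3"},
            "bn": {"value": "value2", "publishedAt": 164322907},
        },
    }
}
sample_rows = extract_rows(sample)
sample_df = pd.DataFrame(sample_rows, columns=COLUMNS)
print(sample_df)
```

With real files, something like frame_from_folder(path_to_json).to_csv("output.csv", index=False) should write the combined table; the output file name here is just a placeholder.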
Answered By: puncher