How to convert CSV to nested JSON in Python

Question:

I have a csv file in the following format:

a b c d e
1 2 3 4 5
9 8 7 6 5

I want to convert this csv file to Nested JSON format, like this:

[{"a": 1,
"Purchase" : {
              "b": 2,
              "c": 3,
              "d": 4},
"Sales": {
           "d": 4,
           "e": 5}},
{"a": 9,
"Purchase" : {
              "b": 8,
              "c": 7},
"Sales": {
           "d": 6,
           "e": 5}}]

How can I make this transformation in Python? I can’t figure it out.
Keep in mind this is only a sample table; my real table has many columns and thousands of rows, so doing it by hand is not practical.

So far I have tried this code:

import csv

with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    for r in reader:
        r["purchase"] = {"b": r['b'],
                        "c": r['c'],
                        }

Here I am trying, without success, to add another key-value pair holding the nested dictionary I need. I would do the same for Sales, but this is just a sample.

Asked By: Bhavya Budhia

||

Answers:

A simple way is to add the nested values as new columns and then use the to_json method in pandas:

import pandas as pd
df = pd.read_csv('your_file.csv')
df['Purchase'] = df[['b','c','d']].to_dict('records')
df['Sales'] = df[['d','e']].to_dict('records')
out = df[['a', 'Purchase', 'Sales']].to_json(orient='records', indent=4)

Output:

[
    {
        "a":1,
        "Purchase":{
            "b":2,
            "c":3,
            "d":4
        },
        "Sales":{
            "d":4,
            "e":5
        }
    },
    {
        "a":9,
        "Purchase":{
            "b":8,
            "c":7,
            "d":6
        },
        "Sales":{
            "d":6,
            "e":5
        }
    }
]
Answered By: user7864386

You don’t need any third-party libraries for this; the standard csv and json modules are enough. Just specify the right dialect, e.g. for a tab-separated file:

import csv
import json


with open("tmp4.csv", "r") as f:
    result = [
        {
            "a": row["a"],
            "Purchase": {
                "b": row["b"],
                "c": row["c"],
            },
            "Sales": {
                "d": row["d"],
                "e": row["e"],
            },
        }
        for row in csv.DictReader(f, dialect='excel-tab')
    ]
assert (
    json.dumps(result)
    == '[{"a": "1", "Purchase": {"b": "2", "c": "3"}, "Sales": {"d": "4", "e": "5"}}, {"a": "9", "Purchase": {"b": "8", "c": "7"}, "Sales": {"d": "6", "e": "5"}}]'
)
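
Note that csv.DictReader yields every value as a string, which is why the assertion above compares against "1" rather than 1. If numbers are needed, as in the question's desired output, the values can be converted while each record is built. A minimal sketch, assuming every column holds an integer; the nest_row helper and the inline SAMPLE data are illustrative names, not part of the original answer:

```python
import csv
import io
import json

# Inline stand-in for the tab-separated file, so the sketch runs on its own.
SAMPLE = "a\tb\tc\td\te\n1\t2\t3\t4\t5\n9\t8\t7\t6\t5\n"


def nest_row(row):
    """Build one nested record, converting each value to int.

    Assumption: every column contains integer text; int() raises
    ValueError otherwise.
    """
    return {
        "a": int(row["a"]),
        "Purchase": {"b": int(row["b"]), "c": int(row["c"])},
        "Sales": {"d": int(row["d"]), "e": int(row["e"])},
    }


reader = csv.DictReader(io.StringIO(SAMPLE), dialect="excel-tab")
result = [nest_row(row) for row in reader]
print(json.dumps(result, indent=4))
```

With a real file, replace io.StringIO(SAMPLE) with the open file object.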
Answered By: westandskif

When you do r["purchase"] = {"b": ...}, you’re assigning the dictionary to the per-row object r, which is discarded at the end of each loop iteration. Instead, create a new dictionary per record and append it to a list, like this:

import csv

result = []
with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    for r in reader:
        result.append({
            "a": r["a"],
            "Purchase" : {
                "b": r["b"],
                "c": r["c"],
                "d": r["d"],
            },
            "Sales": {
                "d": r["d"],
                "e": r["e"],
            },
        })

Or, using a list comprehension to create result:

with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    result = [{
        "a": r["a"],
        "Purchase" : {
            "b": r["b"],
            "c": r["c"],
            "d": r["d"],
        },
        "Sales": {
            "d": r["d"],
            "e": r["e"],
        },
    } for r in reader]
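
For a table with many columns, writing each key by hand gets tedious. One option is to describe the nesting once in a mapping and build every record from it. A sketch under stated assumptions: TOP_LEVEL, GROUPS, and nest are hypothetical names, the sample data is inlined and comma-separated, and values stay as strings because that is what csv.DictReader returns:

```python
import csv
import io
import json

# Inline stand-in for new_data.csv (assumed comma-separated here).
SAMPLE = "a,b,c,d,e\n1,2,3,4,5\n9,8,7,6,5\n"

# Columns copied to the top level of each record.
TOP_LEVEL = ["a"]

# Nested key -> columns placed under it. A column may appear in several groups.
GROUPS = {
    "Purchase": ["b", "c", "d"],
    "Sales": ["d", "e"],
}


def nest(row):
    """Return a new nested dictionary for one CSV row."""
    record = {col: row[col] for col in TOP_LEVEL}
    for group, cols in GROUPS.items():
        record[group] = {col: row[col] for col in cols}
    return record


result = [nest(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
json_text = json.dumps(result, indent=4)
print(json_text)
```

To write the result to disk instead of printing it, json.dump(result, f, indent=4) with an open output file f does the same serialization.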
Answered By: aneroid

Regarding these lines:

df['Purchase'] = df[['b','c','d']].to_dict('records')
df['Sales'] = df[['d','e']].to_dict('records')
out = df[['a', 'Purchase', 'Sales']].to_json(orient='records', indent=4)

How do we create an array in the JSON?

Currently this code generates a JSON structure like
Purchase: {'b', 'c', 'd'}
Sales: {'d', 'e'}
In case I want an output like
"Purchase":[{"b":1,"c":{"test":"abc"}, "d":{"testing":"def"}}], what changes should I make to the above code? Kindly advise.

Answered By: Jalaja Muthuraj
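
One way to get an array under "Purchase" is to wrap each per-row dictionary in a list before assigning the column. A minimal sketch, assuming pandas is installed and using inline comma-separated sample data in place of your_file.csv. The nested values such as {"test": "abc"} do not come from the CSV, so they would have to be built separately and are not shown here:

```python
import io
import json

import pandas as pd

# Inline stand-in for your_file.csv (assumed comma-separated).
csv_text = "a,b,c,d,e\n1,2,3,4,5\n9,8,7,6,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Wrap each row's dictionary in a one-element list so that it
# serializes as a JSON array instead of a JSON object.
purchase_records = df[["b", "c", "d"]].to_dict("records")
df["Purchase"] = [[rec] for rec in purchase_records]

# Sales is left as a plain object for comparison.
df["Sales"] = df[["d", "e"]].to_dict("records")

out = df[["a", "Purchase", "Sales"]].to_json(orient="records", indent=4)
parsed = json.loads(out)
print(out)
```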