How to create a JSON file from a CSV with no headers

Question:

I have a csv file like this:

'3', '8948', 'f678'
'3', '5654', 'f644'
'6', '5567', 'g3335'
'9', '4467', 'g3356'
'9', '7666', 'h4433'

The CSV holds various records. The first column represents an ID field.

I have looped through the CSV file and added the rows to a list.

I then used that list to make a JSON file, which looks like this:

[
    [
        "3",
        "8948",
        "f678"
    ],
    [
        "3",
        "5654",
        "f644"
    ],
    [
        "6",
        "5567",
        "g3335"
    ]
     ...

But as I understand it, I won’t be able to read from this JSON and perform tasks on it? From what I can see, I need it to be a dictionary, but how can I make a dictionary from my CSV, especially since the ID field is repeated and won’t be unique? The only other option is to use a row number. If this is correct, how do I create a dictionary from my CSV keyed by row number?

Asked By: sr546

Answers:

I’m guessing your code so far looks something like this:

import csv
import json

data: list[list[str]] = []
with open("input.csv", newline="") as f_in:
    reader = csv.reader(f_in)
    for row in reader:
        data.append(row)

with open("data.json", "w") as f_out:
    json.dump(data, f_out, indent=2)

To address your first issue/concern about how valid this JSON is or isn’t…

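The bigger concept to take away is that Python’s json module produces valid JSON. If json.dump() finished without raising an exception, the output is good. (One edge case: by default it writes NaN and Infinity floats as non-standard literals, which does not apply to string data like yours.)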

But to more directly address your concern, JSON can look like a lot of different things:

print(json.dumps(1))
print(json.dumps("A"))
print(json.dumps({}))
print(json.dumps([]))

Each one of those dumps() calls produces valid JSON. I don’t know how to formally prove that, but I do trust tools like Python’s json module, which has been vetted over many years of real-world use. I also entered those simple examples directly at https://jsonlint.com/ and got "Valid JSON" for all of them.
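
An informal way to check this yourself is to round-trip the data: dump it, load it back, and compare. The sketch below assumes a small in-memory sample in the same list-of-lists shape as your JSON (the names rows and id_3_rows are only illustrative). It also shows that the list-of-lists form can be read back and worked with as it is:

```python
import json

# Each simple value from the examples above survives a dump/load round trip.
for value in (1, "A", {}, []):
    assert json.loads(json.dumps(value)) == value

# Sample rows in the same list-of-lists shape as the question's JSON.
rows = [["3", "8948", "f678"], ["3", "5654", "f644"], ["6", "5567", "g3335"]]

# Serialize to a JSON string, then parse it back into Python objects.
loaded = json.loads(json.dumps(rows))
assert loaded == rows

# Example task: keep only the rows whose first column (the ID) is "3".
id_3_rows = [row for row in loaded if row[0] == "3"]
print(id_3_rows)
```

So a dictionary is not required just to read the file back; it only makes the fields easier to refer to by name.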

Now, what to do about the JSON you have?

You can process it the way it is, or you can create the structure you want by providing column names/keys yourself (assuming you know what the data represents):

data_keyed: list[dict[str, str]] = []
with open("input.csv", newline="") as f_in:
    reader = csv.reader(f_in)
    for row in reader:
        data_row = {"Col1": row[0], "Col2": row[1], "Col3": row[2]}
        data_keyed.append(data_row)

with open("data_keyed.json", "w") as f_out:
    json.dump(data_keyed, f_out, indent=2)

and now we get:

[
  {
    "Col1": "3",
    "Col2": "8948",
    "Col3": "f678"
  },
  {
    "Col1": "3",
    "Col2": "5654",
    "Col3": "f644"
  },
  ...
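
If you do want a dictionary at the top level, here is a minimal sketch of the two options raised in the question: keying by row number, or grouping rows under the repeated ID. The sample rows and the names by_row_number and by_id are assumptions for illustration, not part of your code:

```python
import json
from collections import defaultdict

# Sample rows standing in for the data read from the CSV.
rows = [
    ["3", "8948", "f678"],
    ["3", "5654", "f644"],
    ["6", "5567", "g3335"],
]

# Option 1: key each row by its position, starting at 1.
# JSON object keys are always strings, so convert explicitly.
by_row_number = {str(n): row for n, row in enumerate(rows, start=1)}

# Option 2: group rows that share an ID (first column) under one key,
# so a repeated ID maps to a list of the remaining columns.
by_id = defaultdict(list)
for row_id, *rest in rows:
    by_id[row_id].append(rest)

print(json.dumps(by_row_number, indent=2))
print(json.dumps(by_id, indent=2))
```

Grouping keeps the ID meaningful even though it repeats, while the row-number version is essentially the original list with explicit indexes.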
Answered By: Zach Young

A solution that uses no third-party libraries:

import json
from csv import DictReader

with open("tmp/1.csv", newline="") as f_in, open("tmp/result.json", "w") as f_out:
    dict_reader = DictReader(f_in, fieldnames=["field1", "field2", "field3"])
    json.dump(list(dict_reader), f_out)
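
To see what DictReader yields when fieldnames is supplied, here is a small sketch that reads from an in-memory string instead of a file. The csv_text sample is an assumption standing in for the headerless CSV. Because the field names are passed explicitly, the first line is treated as data rather than as a header:

```python
import csv
import io
import json

# In-memory stand-in for the headerless CSV file.
csv_text = "3,8948,f678\n3,5654,f644\n6,5567,g3335\n"

reader = csv.DictReader(
    io.StringIO(csv_text), fieldnames=["field1", "field2", "field3"]
)
records = list(reader)

# Each record maps the supplied field names to that row's values.
print(json.dumps(records[0]))
# → {"field1": "3", "field2": "8948", "field3": "f678"}
```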

However, if this isn’t going to be the only transform applied to this table data, there’s the Table helper in the convtools library:

import json
from convtools.contrib.tables import Table

results = list(
    Table.from_csv(
        "tmp/1.csv", header=["field1", "field2", "field3"]
    ).into_iter_rows(dict)
)

with open("tmp/result.json", "w") as f:
    json.dump(results, f)
Answered By: westandskif