Converting JSON into newline delimited JSON in Python
Question:
My goal is to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery (as described here) with Python.
I have tried using the newlineJSON package for the conversion but receive the following error:
JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)
Does anyone have a solution to this?
Here is the sample JSON code:
[{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
}
]
And here’s the existing python script:
with nlj.open(url_samplejson, json_lib="simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)
Answers:
If you are willing to step outside Python, use jq:
$ cat a.json
[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]
$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
The filter '.[]' iterates over the array, and -c (compact output) puts each JSON object on a single line.
The answer with jq is really useful, but if you still want to do it with Python (as the question suggests), you can use the built-in json module.
import json
from io import StringIO
in_json = StringIO("""[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]""")
result = [json.dumps(record) for record in json.load(in_json)]  # the only significant line: converts the JSON array into the desired format
print('\n'.join(result))
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
* I’m using StringIO and print here just to make the sample easier to test locally.
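Note that the output above differs slightly from the jq output: json.dumps puts a space after each colon and comma by default. If you want the same compact style as jq -c, you can pass the separators argument (this is standard-library behavior, shown here as a small sketch):

```python
import json

record = {"key01": "value01", "key02": "value02"}

# Default formatting includes a space after ':' and ','.
assert json.dumps(record) == '{"key01": "value01", "key02": "value02"}'

# separators=(",", ":") produces jq -c style compact output.
assert json.dumps(record, separators=(",", ":")) == '{"key01":"value01","key02":"value02"}'
```

BigQuery accepts either form, so this is purely cosmetic.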
As an alternative, you can use the Python jq bindings to combine this with the other answer.
This takes a JSON file and converts it into an NDJSON file.
import json

with open("results-20190312-113458.json", "r") as read_file:
    data = json.load(read_file)

result = [json.dumps(record) for record in data]

with open("nd-processed.json", "w") as obj:
    for i in result:
        obj.write(i + '\n')
Hope this helps someone.
Or stream each object straight to the output file:
with open('out.json', 'w') as f:
    for obj in objs:
        json.dump(obj, f)
        f.write('\n')
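Putting the pieces together, here is a minimal end-to-end sketch (file names and sample records are placeholders) that writes an in-memory array as NDJSON and verifies that each line parses on its own, which is what BigQuery requires:

```python
import json
import tempfile

# Sample records standing in for the array from the question.
records = [
    {"key01": "value01", "key02": "value02"},
    {"key01": "value11", "key02": "value12"},
]

# Write one compact JSON object per line (NDJSON).
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    out_path = f.name
    for record in records:
        f.write(json.dumps(record) + "\n")

# Verify: every line is independently parseable and the data round-trips.
with open(out_path) as f:
    parsed = [json.loads(line) for line in f]

assert parsed == records
```

The key property is that json.loads succeeds on each line in isolation; no line depends on the brackets or commas of the surrounding array.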