Converting JSON into newline delimited JSON in Python

Question:

My goal is to use Python to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery (as described here).

I have tried using the newlineJSON package for the conversion, but I receive the following error.

JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)

Does anyone have the solution to this?

Here is the sample JSON code:

[{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
}
]

And here’s the existing Python script:

with nlj.open(url_samplejson, json_lib = "simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)
Asked By: Fxs7576

||

Answers:

If you are willing to get out of Python, use jq:

$ cat a.json 
[{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
}
]


$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}

The iterator I used is '.[]' to go through the array, and -c puts each JSON object on a single line.
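For comparison, the same compact, one-object-per-line output can be approximated in Python with the standard json module. This is only a minimal sketch: the inline sample string and the helper name to_compact_lines are made up for illustration, and the separators argument is what removes the spaces that json.dumps inserts by default, so the lines look like the jq -c output.

```python
import json

# Hypothetical inline sample standing in for the contents of a.json.
raw = '[{"key01": "value01", "key02": "value02"}, {"key01": "value01", "key02": "value02"}]'


def to_compact_lines(text):
    """Return one compact JSON string per element of a top-level JSON array."""
    records = json.loads(text)
    # separators=(",", ":") drops the default spaces after commas and colons.
    return [json.dumps(record, separators=(",", ":")) for record in records]


lines = to_compact_lines(raw)
print("\n".join(lines))
```

Each element of lines is a complete JSON document on its own, which is what a newline-delimited JSON consumer expects.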


Answered By: Felipe Hoffa

The jq answer is really useful, but if you still want to do it in Python (as the question suggests), you can use the built-in json module.

import json
from io import StringIO
in_json = StringIO("""[{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
}
]""")

result = [json.dumps(record) for record in json.load(in_json)]  # the only significant line to convert the JSON to the desired format

print('\n'.join(result))

{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}

* I’m using StringIO and print here just to make a sample easier to test locally.

As an alternative, you can use the Python jq bindings to combine this with the other answer.

Answered By: Oleh Rybalchenko

This takes a JSON file and converts it into an ND-JSON file.

import json

with open("results-20190312-113458.json", "r") as read_file:
    data = json.load(read_file)
result = [json.dumps(record) for record in data]
with open('nd-proceesed.json', 'w') as obj:
    for i in result:
        obj.write(i + '\n')
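The same steps could be wrapped in a small reusable function. The sketch below is one possible arrangement, not a definitive implementation: the function name json_array_to_ndjson, its parameters, and the check that the top-level value is a list are all assumptions added for illustration.

```python
import json


def json_array_to_ndjson(src_path, dst_path):
    """Convert a file holding one JSON array into newline-delimited JSON.

    Returns the number of records written. Both paths are supplied by the caller.
    """
    with open(src_path, "r", encoding="utf-8") as src:
        data = json.load(src)
    # Assumption: the input is a top-level array, as in the question's sample.
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in data:
            # json.dumps never emits a raw newline, so each record stays on one line.
            dst.write(json.dumps(record) + "\n")
    return len(data)
```

Calling json_array_to_ndjson("input.json", "output.json") with your own file names writes one record per line and returns the record count.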

Hope this helps someone.

Answered By: Saurav Joshi
import json

# objs is assumed to be the list of records already parsed from the source JSON
with open('out.json', 'w') as f:
    for obj in objs:
        json.dump(obj, f)
        f.write('\n')
Answered By: Dan Burton