Create array of json objects using for loops

Question:

I’m attempting to extract values from an HTML page and convert them into a JSON array. So far I have been able to get the values I want, but only as separate strings.

I wrote two for loops:

for line in games_html.findAll('div', class_="product_score"):
    score = "{'Score': %s}" % line.getText(strip=True)
    print(score)

for line in games_html.findAll('a'):
    title = "{'Title': '%s'}" % line.getText(strip=True)
    print(title)

Which produce these two outputs:

{'Title': 'Uncanny Valley'}
{'Title': 'Subject 13'}
{'Title': '2Dark'}
{'Title': 'Lethal VR'}
{'Title': 'Earthlock: Festival of Magic'}
{'Title': 'Knee Deep'}
{'Title': 'VR Ping Pong'}

and

{'Score': 73}
{'Score': 73}
{'Score': 72}
{'Score': 72}
{'Score': 72}
{'Score': 71}
{'Score': 71}

(they are longer but you can get an idea with this…)

How can I use Python to create a JSON array out of these that would look like:

[{'Title': 'Uncanny Valley', 'Score': 73}, {....}]

I am gonna use the resulting array to do other things afterwards….

Do I need to store the items from the loop into lists and then merge them? Could you please illustrate an example given my scenario?

Asked By: geekiechic


Answers:

Instead of printing, append the values to two lists, one for scores and one for titles. Then zip the two lists together in a list comprehension to get the desired output:

import json
scores, titles = [], []
for line in games_html.findAll('div', class_="product_score"):
    scores.append(line.getText(strip=True))

for line in games_html.findAll('a'):
    titles.append(line.getText(strip=True))

score_titles = [{"Title": t, "Score": s} for t, s in zip(titles, scores)]
print(score_titles)
# Print in JSON format
print(json.dumps(score_titles))
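
Note that getText() returns strings, so the scores above come out as "73" rather than the number 73 shown in the question. Here is a minimal sketch of the same zip approach with a numeric conversion. The two lists are made-up stand-ins for the scraped text, and int() assumes every score is purely numeric (it would raise ValueError on text such as "tbd"):

```python
import json

# Hypothetical stand-ins for the text collected by the two loops.
titles = ["Uncanny Valley", "Subject 13", "2Dark"]
scores = ["73", "73", "72"]  # scraped text is always a string

# Pair each title with its score; int() turns "73" into the number 73.
score_titles = [{"Title": t, "Score": int(s)} for t, s in zip(titles, scores)]

# Serialize the list of dictionaries to a JSON array string.
json_text = json.dumps(score_titles)
print(json_text)
```
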
Answered By: ZdaR

As ZdaR’s post illustrates, to create JSON you need to build the corresponding Python data structure (lists for JSON arrays, dictionaries for JSON objects) and serialize it at the end. So the question is almost the same as how to create a list in a loop, because once the list exists, all that remains is serialization, which is as simple as json.dumps(data).
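
To keep the two directions apart: json.dumps turns a Python structure into a JSON string, and json.loads parses a JSON string back into Python objects. A small round trip with a made-up record:

```python
import json

# A Python list of dictionaries corresponds to a JSON array of objects.
data = [{"Title": "Uncanny Valley", "Score": 73}]

text = json.dumps(data)      # serialize: Python -> JSON string
restored = json.loads(text)  # parse: JSON string -> Python

print(text)
print(restored == data)
```
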

The task in the OP can be done in two loops:

import json

# one dictionary per title
data = [{'Title': line.getText(strip=True)} for line in games_html.findAll('a')]

for i, line in enumerate(games_html.findAll('div', class_="product_score")):
    data[i]['Score'] = line.getText(strip=True)

# serialize to json array
j = json.dumps(data)

# or write to a file
with open('data.json', 'w') as f:
    json.dump(data, f)
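
Both approaches pair items by position, so they assume the page yields exactly one score per title. zip stops silently at the shorter list, and data[i] raises IndexError if there are more scores than titles. A minimal sketch of an explicit length check, where merge_titles_and_scores is a hypothetical helper and the sample values are made up:

```python
def merge_titles_and_scores(titles, scores):
    """Hypothetical helper: pair titles with scores, refusing mismatched lengths."""
    if len(titles) != len(scores):
        raise ValueError(
            "got %d titles but %d scores" % (len(titles), len(scores))
        )
    return [{"Title": t, "Score": s} for t, s in zip(titles, scores)]


# Stand-ins for the scraped text.
merged = merge_titles_and_scores(["Uncanny Valley", "Subject 13"], ["73", "73"])
print(merged)
```
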
Answered By: cottontail