Create array of json objects using for loops
Question:
I’m attempting to extract values from an HTML page and convert them into a JSON array. So far I have been able to get what I want, but only as separate strings.
I wrote two for loops:
for line in games_html.findAll('div', class_="product_score"):
    score = "{'Score': %s}" % line.getText(strip=True)
    print(score)
for line in games_html.findAll('a'):
    title = "{'Title': '%s'}" % line.getText(strip=True)
    print(title)
These produce the following two outputs:
{'Title': 'Uncanny Valley'}
{'Title': 'Subject 13'}
{'Title': '2Dark'}
{'Title': 'Lethal VR'}
{'Title': 'Earthlock: Festival of Magic'}
{'Title': 'Knee Deep'}
{'Title': 'VR Ping Pong'}
and
{'Score': 73}
{'Score': 73}
{'Score': 72}
{'Score': 72}
{'Score': 72}
{'Score': 71}
{'Score': 71}
(The outputs are longer, but this gives you the idea.)
How can I use Python to create a JSON array out of these that would look like:
[{'Title': 'Uncanny Valley', 'Score': 73}, {....}]
I am going to use the resulting array for other things afterwards.
Do I need to store the items from each loop in lists and then merge them? Could you please illustrate with an example for my scenario?
Answers:
You need to maintain two lists, one for scores and one for titles, and append the data to those lists instead of printing it. Then zip the lists together in a list comprehension to get the desired output:
import json

scores, titles = [], []
for line in games_html.findAll('div', class_="product_score"):
    scores.append(line.getText(strip=True))
for line in games_html.findAll('a'):
    titles.append(line.getText(strip=True))

# Pair each title with the score at the same position
score_titles = [{"Title": t, "Score": s} for t, s in zip(titles, scores)]
print(score_titles)

# Printing in JSON format
print(json.dumps(score_titles))
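The same pattern can be tried without a page to scrape. This is a minimal sketch, assuming the scraped text has already been collected into two lists; the sample values are copied from the question's output. Converting each score with int() is an addition not in the answer above, made so the JSON carries numbers rather than strings, as in the desired output:

```python
import json

# Sample values standing in for the scraped text (copied from the question's output).
titles = ["Uncanny Valley", "Subject 13", "2Dark"]
scores = ["73", "73", "72"]  # getText() returns strings, so scores start out as text

# Pair each title with the score at the same position; int() turns "73" into 73.
score_titles = [{"Title": t, "Score": int(s)} for t, s in zip(titles, scores)]

# Serialize the list of dicts to a JSON array string.
as_json = json.dumps(score_titles)
print(as_json)
```

If a score could ever be non-numeric text, int() would raise ValueError, so in that case it may be safer to keep the string or handle the exception.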
As ZdaR’s post illustrates, to create JSON you build the corresponding Python data structure (lists for JSON arrays, dictionaries for JSON objects) and serialize it at the end. So the question is almost the same as how to create a list in a loop, because once the list exists, all that remains is serialization, which is as simple as json.dumps(data).
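To keep the two directions apart: json.dumps turns Python objects into a JSON string, and json.loads parses a JSON string back into Python objects. A small round-trip sketch with one made-up record (the values are taken from the question's sample output):

```python
import json

data = [{"Title": "Knee Deep", "Score": 71}]

text = json.dumps(data)      # Python list of dicts -> JSON string
restored = json.loads(text)  # JSON string -> Python list of dicts

print(text)
print(restored == data)
```

Note that json.dumps emits double-quoted keys and strings. The hand-built strings in the question use single quotes, which is not valid JSON.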
The task in the OP can be done in two loops:
import json

# One dict per title, then add the score at the same position
data = [{'Title': line.getText(strip=True)} for line in games_html.findAll('a')]
for i, line in enumerate(games_html.findAll('div', class_="product_score")):
    data[i]['Score'] = line.getText(strip=True)

# serialize to json array
j = json.dumps(data)

# or write to a file
with open('data.json', 'w') as f:
    json.dump(data, f)
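Both approaches rely on the two findAll calls returning the same number of items in the same order. If they do not, zip silently stops at the shorter list, and the enumerate version raises IndexError when there are more scores than titles. A hedged sketch of an explicit check follows; merge_titles_scores is a hypothetical helper name, not part of any library:

```python
import json


def merge_titles_scores(titles, scores):
    """Pair titles with scores by position, refusing mismatched lengths."""
    if len(titles) != len(scores):
        raise ValueError(
            "got %d titles but %d scores" % (len(titles), len(scores))
        )
    return [{"Title": t, "Score": s} for t, s in zip(titles, scores)]


# Sample values copied from the question's output.
merged = merge_titles_scores(["Lethal VR", "VR Ping Pong"], ["72", "71"])
print(json.dumps(merged))
```

Failing loudly here makes it easier to notice when the page contains extra links or a game without a score.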