Create dictionary using JSON data

Question:

I have a JSON file that has movie data in it. I want to create a dictionary that has the movie title as the key and a count of how many actors are in that movie as the value. An example from the JSON file is below:

    {
        "title": "Marie Antoinette",
        "year": "2006",
        "genre": "Drama",
        "summary": "Based on Antonia Fraser's book about the ill-fated Archduchess of Austria and later Queen of France, 'Marie Antoinette' tells the story of the most misunderstood and abused woman in history, from her birth in Imperial Austria to her later life in France.",
        "country": "USA",

        "director": {
            "last_name": "Coppola",
            "first_name": "Sofia",
            "birth_date": "1971"
        },
        "actors": [
            {
                "first_name": "Kirsten",
                "last_name": "Dunst",
                "birth_date": "1982",
                "role": "Marie Antoinette"
            },
            {
                "first_name": "Jason",
                "last_name": "Schwartzman",
                "birth_date": "1980",
                "role": "Louis XVI"
            }
        ]
    }

I have the following but it’s counting all of the actors from all of the movies instead of each movie and the number of actors per movie. I’m not sure how to do this correctly as I’m newer to Python so help would be great.

import json

def actor_count(json_data):
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
        for t in data:
            title = [t['title'] for t in data]
            for element in data:
                for actor in element['actors']:
                    rolee = [actor['role'] for movie in data for actor in movie['actors']]
                    len_role = [len(role)]
            newD = dict(zip(title, len_role))
        print(newD)
                
json_data = open('movies_db.json')
actor_count(json_data)
Asked By: P0ffee


Answers:

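import json

def actor_count(json_data):
    # Assumes the file holds a single movie object (one dict), as in the sample.
    newD = dict()
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
        # Iterating over a dict yields its keys.
        for t in data:
            if t == 'title':
                title_ = data[t]
                newD[title_] = 0
            if t == 'actors':
                # "title" precedes "actors" in the sample, so title_ is already set.
                newD[title_] = len(data[t])
        print(newD)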

Output:

{'Marie Antoinette': 2}
Answered By: Will

You show JSON that contains only a single dictionary, yet your code processes it as if it were a list of dictionaries with that structure. Pending clarification, I am answering as if the latter is true, i.e. you have a list of dictionaries, since you would be asking about a different error if this were not the case.

In your function, each element of data is a dictionary that contains the information for a single movie. To get a dict mapping the title to the count of actors in that movie, you just need to access the "title" key and take the length of the list stored under the "actors" key for each element.

def actor_count(json_data):
    movie_actors = {}
    for movie in json_data:
        title = movie["title"]
        num_actors = len(movie["actors"])
        movie_actors[title] = num_actors

    return movie_actors

Alternatively, use a dictionary comprehension to build this dictionary:

def actor_count(json_data):
    movie_actors = {movie["title"]: len(movie["actors"]) for movie in json_data}
    return movie_actors

Now, load your JSON file once, and pass the parsed data when you call actor_count. This will return a dictionary mapping each movie title to the number of actors.

import json

with open("movies_db.json", 'r') as file:
    data = json.load(file)

print(actor_count(data))

Note that loading the json file again in the function is unnecessary, since you already did it before calling the function, and are passing the parsed object to the function.
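
To see the whole flow end to end, here is a minimal sketch that parses an inline JSON string instead of reading movies_db.json. The second movie entry and the load_movies helper are made up for illustration; the helper assumes the top level is either a single movie object or a list of them, which covers both readings of your sample.

```python
import json

# Inline stand-in for movies_db.json. The second entry is a made-up placeholder.
SAMPLE = """
[
    {"title": "Marie Antoinette",
     "actors": [{"first_name": "Kirsten", "last_name": "Dunst"},
                {"first_name": "Jason", "last_name": "Schwartzman"}]},
    {"title": "Placeholder Movie",
     "actors": [{"first_name": "Jane", "last_name": "Doe"}]}
]
"""

def load_movies(text):
    """Hypothetical helper: parse JSON text and always return a list of movies."""
    data = json.loads(text)
    # A lone movie object is wrapped so the caller can always iterate over movies.
    return [data] if isinstance(data, dict) else data

def actor_count(json_data):
    """Map each movie title to the number of entries in its "actors" list."""
    return {movie["title"]: len(movie["actors"]) for movie in json_data}

counts = actor_count(load_movies(SAMPLE))
print(counts)  # {'Marie Antoinette': 2, 'Placeholder Movie': 1}
```

If two movies share a title, the later one overwrites the earlier one in the resulting dict, so this only works cleanly when titles are unique.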


If you want to keep your current logic of using list comprehensions, and then zipping the resultant lists to create a dict, that is also possible although slightly less efficient. There are significant changes you will need to make:

def actor_count(json_data):
    title = [t['title'] for t in json_data]
    n_actors = [len(t['actors']) for t in json_data]
    newD = dict(zip(title, n_actors))
    return newD

  1. As before, no need to read the file again in the function
  2. You’re already looping over all elements in json_data as part of the list comprehension, so no need for another loop outside this.
  3. You can get the number of actors simply by len(t['actors'])
  4. You seem to have misconceptions about how list comprehensions and loops work. A list comprehension is a self-contained loop that builds a list. If you have a list comprehension, there’s usually no need to surround it by the same for ... in ... statement that already exists in the comprehension.
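
Point 4 can be checked directly. This sketch uses a small made-up list in place of the parsed file. It shows that a comprehension gives the same result as the explicit loop it replaces, and that an outer loop around it only rebuilds the same list.

```python
# Made-up stand-in for the parsed JSON data.
movies = [
    {"title": "A", "actors": [{"role": "x"}, {"role": "y"}]},
    {"title": "B", "actors": [{"role": "z"}]},
]

# Explicit loop: append one title per movie.
titles_loop = []
for movie in movies:
    titles_loop.append(movie["title"])

# The comprehension is the same loop written as one expression.
titles_comp = [movie["title"] for movie in movies]

# Wrapping the comprehension in another loop recomputes the identical list
# once per movie; only the last assignment survives.
rebuild_count = 0
for _ in movies:
    titles_redundant = [movie["title"] for movie in movies]
    rebuild_count += 1

print(titles_loop == titles_comp == titles_redundant)  # True
print(rebuild_count)  # 2
```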
Answered By: Pranav Hosangadi