Recursively find and return key and value from nested dictionaries in Python

Question:

Could someone help me with my code below?

This was originally meant to work with data in a JSON file, but I have converted it to work with a JSON-style dictionary variable.

Right now the get_data_value() function works, but instead of returning just the value, I would like it to return a single dict containing both the key and the value.

I am not sure how to change the item_generator function to make this possible without breaking the recursion; I found this function in an example here on Stack Overflow.

def get_data_value(data,data_name):
    d = data['test']
    print(item_generator(d,data_name))
    for _ in item_generator(d,data_name):
        return (_)
    
def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for key, value in json_input.items():
            if key == lookup_key:  
                data_single_item = {key:value}  # what i want to return
                print(data_single_item)       
                
                yield value                    # only value is returned
            else:
                yield from item_generator(value, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)
            

json_data = { "test": [ { "Tier1": [ { "Tier1-Main-Title-1": [ { "title": "main", "example": 400 } ] } ] }, { "Tier2": [] }, { "Tier3": [ { "Example1-Sub1": 44 } ] } ] }

print(get_data_value(json_data,'title'))
Asked By: John Anthony

Answers:

I am not sure if I am missing something, but why don't you just yield what you want? For example:

def get_data_value(data, data_name):
    d = data['test']
    # Return the first match; falls through and returns None if there is none
    for item in item_generator(d, data_name):
        return item

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for key, value in json_input.items():
            if key == lookup_key:      
                yield {key:value} 
            else:
                yield from item_generator(value, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)


json_data = { "test": [ { "Tier1": [ { "Tier1-Main-Title-1": [ { "title": "main", "example": 400 } ] } ] }, { "Tier2": [] }, { "Tier3": [ { "Example1-Sub1": 44 } ] } ] }

print(get_data_value(json_data,'title'))
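
If only the first match is ever needed, the loop-and-return pattern can also be written with the built-in next() and a default value. This is just a minimal sketch of that idea; the helper name get_first_match and its default parameter are made up for illustration and are not part of the original code.

```python
def item_generator(json_input, lookup_key):
    """Yield {key: value} for every occurrence of lookup_key, at any depth."""
    if isinstance(json_input, dict):
        for key, value in json_input.items():
            if key == lookup_key:
                yield {key: value}
            else:
                yield from item_generator(value, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)


def get_first_match(data, lookup_key, default=None):
    # Hypothetical helper: next() pulls one item from the generator and
    # falls back to `default` when the generator yields nothing.
    return next(item_generator(data, lookup_key), default)


json_data = {"test": [{"Tier1": [{"Tier1-Main-Title-1": [{"title": "main", "example": 400}]}]},
                      {"Tier2": []},
                      {"Tier3": [{"Example1-Sub1": 44}]}]}

first = get_first_match(json_data["test"], "title")
missing = get_first_match(json_data["test"], "no-such-key")

print(first)    # {'title': 'main'}
print(missing)  # None
```

The explicit default makes the "key not found" case visible at the call site instead of relying on the function falling off the end of the loop.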

Also, if "title" appears in several different sub-objects and you want all of the matches back, for example in a list, this works as well:

def get_data_value(data, data_name):
    d = data["test"]
    results = []
    for item in item_generator(d, data_name):
        results.append(item)
    return results


def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for key, value in json_input.items():
            if key == lookup_key:
                yield {key: value}
            else:
                yield from item_generator(value, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)


json_data = {
    "test": [
        {
            "Tier1": [
                {
                    "Tier1-Main-Title-1": [
                        {"title": "main", "example": 400},
                        {"title": "example2"},
                    ]
                }
            ]
        },
        {"Tier2": []},
        {
            "Tier3": [
                {"Example1-Sub1": 44},
                {"AnotherExample": 9856, "title": "example3"},
            ]
        },
    ]
}

print(get_data_value(json_data, "title"))

This returns: [{'title': 'main'}, {'title': 'example2'}, {'title': 'example3'}]

Answered By: KZiovas

It’s worth pointing out you have a bug here:

for _ in item_generator(d,data_name):
    return (_)

This matters because a return statement exits the function immediately. The for loop therefore runs only its first iteration, and the function returns only the first yielded result, i.e. only the first occurrence of the lookup key in json_data.
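
To see the early exit in isolation, here is a small sketch with a stand-in generator. The names numbers, first_only and all_results are hypothetical and only serve the illustration.

```python
def numbers():
    # Hypothetical stand-in for item_generator: yields three results.
    yield 1
    yield 2
    yield 3


def first_only():
    for n in numbers():
        return n  # leaves the function during the first iteration


def all_results():
    # Consuming the whole generator collects every yielded value.
    return list(numbers())


print(first_only())   # 1
print(all_results())  # [1, 2, 3]
```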

You can fix it using generator (or iterable) unpacking into a list, as below:

def get_data_value(data, data_name):
    d = data['test']
    return [*item_generator(d, data_name)]


def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        if lookup_key in json_input:
            yield {lookup_key: json_input[lookup_key]}
        else:
            for v in json_input.values():
                yield from item_generator(v, lookup_key)

    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)


json_data = {"test": [{"Tier1": [{"Tier1-Main-Title-1": [{"title": "main", "example": 400}]}]}, {"Tier2": []},
                      {"Tier3": [{"Example1-Sub1": 44, "title": "TEST2"}]}]}

print(get_data_value(json_data, 'title'))

Result:

[{'title': 'main'}, {'title': 'TEST2'}]
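
One thing to be aware of: checking `lookup_key in json_input` first means that once a dict contains the key, its other values are not searched. The version that compares each key while iterating keeps descending into sibling values. A minimal sketch of the difference, using a made-up sample dict and the hypothetical names per_key and membership for the two variants:

```python
def per_key(json_input, lookup_key):
    # Variant that tests every key and recurses into non-matching values.
    if isinstance(json_input, dict):
        for key, value in json_input.items():
            if key == lookup_key:
                yield {key: value}
            else:
                yield from per_key(value, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from per_key(item, lookup_key)


def membership(json_input, lookup_key):
    # Variant that stops descending as soon as the dict holds the key.
    if isinstance(json_input, dict):
        if lookup_key in json_input:
            yield {lookup_key: json_input[lookup_key]}
        else:
            for v in json_input.values():
                yield from membership(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from membership(item, lookup_key)


# Hypothetical sample: "title" appears at one level and again under a sibling key.
sample = {"title": "outer", "child": [{"title": "inner"}]}

per_key_result = list(per_key(sample, "title"))
membership_result = list(membership(sample, "title"))

print(per_key_result)     # [{'title': 'outer'}, {'title': 'inner'}]
print(membership_result)  # [{'title': 'outer'}]
```

Which behaviour is preferable depends on whether the key can legitimately repeat below a dict that already has it.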

Or, if you’d prefer not to call get_data_value at all:

print(*item_generator(json_data['test'], 'title'))

Indexing with the key 'test' is optional here: because the function is recursive, you can pass json_data itself and it will descend through every nested level.

The results are separated by a single space by default, but you can control the separator by passing the sep parameter to the print function.

{'title': 'main'} {'title': 'TEST2'}
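
As a small sketch of both points, the snippet below passes the whole json_data dict (no ['test'] lookup) and prints one match per line using sep. The variable name matches is introduced only for this example.

```python
def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        if lookup_key in json_input:
            yield {lookup_key: json_input[lookup_key]}
        else:
            for v in json_input.values():
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)


json_data = {"test": [{"Tier1": [{"Tier1-Main-Title-1": [{"title": "main", "example": 400}]}]}, {"Tier2": []},
                      {"Tier3": [{"Example1-Sub1": 44, "title": "TEST2"}]}]}

# Start from the top-level dict; the recursion reaches the "test" list by itself.
matches = list(item_generator(json_data, "title"))

# sep="\n" puts each result on its own line instead of separating with spaces.
print(*matches, sep="\n")
# {'title': 'main'}
# {'title': 'TEST2'}
```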
Answered By: rv.kvetch