How to find all occurrences of a key in a nested dict, while also keeping track of the outer dict key?

Question:

I searched Stack Overflow and found the following code, which lets me recursively search for a key’s values in a nested dict. However, I also want to keep track of the outer dict’s key. How should I do that?

Using Alfe’s answer from the link below, I can get all the values of the key in a nested dict with the following code.
Find all occurrences of a key in nested python dictionaries and lists

data = {'item1': {
  'name': 'dummy',
  'type': 'dummy1'},

'item2': {
  'name': 'dummy',
  'type': 'dummy1',
  'label':'label2'
},

'item3': {
  'name': 'dummy',
  'type': 'dummy1',
  'label':'label3'},

'item4': {
  'name': 'dummy',
  'type': 'dummy1'}
}

def find(key, dictionary):
    for k, v in dictionary.items():
        if k == key:
            yield v
        elif isinstance(v, dict):
            for result in find(key, v):
                yield result
        elif isinstance(v, list):
            for d in v:
                for result in find(key, d):
                    yield result


In [1]: list(find('label', data))
Out[1]: ['label2', 'label3']

However, I also need to record the outer dict key for each match, as in the result below. How should I do this? My data can also have more than one layer of nesting.

{'item2':'label2',
'item3':'label3'
}

I also found the recursive_lookup in this link very neatly written. However, it returns None when I try to run it.

Find keys in nested dictionary

def recursive_lookup(k, d):
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            return recursive_lookup(k, v)
    return None

It returns None when I call recursive_lookup('label', data).

If anyone can point out why the above code is not working, that would be great too!

Asked By: qshng


Answers:

First, create a list outside the recursive function:

outerKeyList = []

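Then, whenever you want to store a key (for example, just before you return or yield the item you were searching for), append it:

outerKeyList.append(key)

Here key stands for whichever key you want to record. In the question’s find function, that is the key of the enclosing dict rather than the key being searched for. Because the list lives outside the recursive function, it ends up holding every key recorded during the search.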
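
The idea above can be sketched roughly as follows. This is only one possible reading of the suggestion: the outer_key parameter is an assumption added here so that each recursive call knows which key it was reached through, and the sample data is a shortened copy of the question’s dict.

```python
# Sample data in the same shape as the question's dict (shortened).
data = {
    'item1': {'name': 'dummy', 'type': 'dummy1'},
    'item2': {'name': 'dummy', 'type': 'dummy1', 'label': 'label2'},
    'item3': {'name': 'dummy', 'type': 'dummy1', 'label': 'label3'},
}

outerKeyList = []  # lives outside the recursive function


def find(key, dictionary, outer_key=None):
    """Yield each value stored under `key`, recording the enclosing key.

    `outer_key` is a hypothetical extra parameter: the key under which
    `dictionary` itself was found (None for the top-level dict).
    """
    for k, v in dictionary.items():
        if k == key:
            outerKeyList.append(outer_key)
            yield v
        elif isinstance(v, dict):
            yield from find(key, v, outer_key=k)
        elif isinstance(v, list):
            for d in v:
                if isinstance(d, dict):
                    yield from find(key, d, outer_key=k)


values = list(find('label', data))
result = dict(zip(outerKeyList, values))
print(result)  # {'item2': 'label2', 'item3': 'label3'}
```

Because outerKeyList is module-level state, it has to be cleared before each new search. Returning the pairs from the function itself avoids that.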

Answered By: Jonathan Rogers

If you only have a single nested dict within your dict then you can use a dict comprehension:

In [9]: def find(label, data):
   ...:     return {outer_key: inner_val for outer_key, outer_val in data.items() for inner_key, inner_val in outer_val.items() if inner_key == label}
   ...:

In [10]: find('label', data)
Out[10]: {'item2': 'label2', 'item3': 'label3'}
Answered By: aydow

This should work regardless of how deep your nesting is (up to the recursion limit, at any rate). The request to keep track of the dict’s key is a little awkward, so I used a tuple to return the pair. Note that if the key is found in the outermost dictionary, the value is returned on its own rather than in a tuple.

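def recursive_lookup(key, d):
    # Found at this level: return the bare value.
    if key in d:
        return d[key]

    for k, v in d.items():
        if isinstance(v, dict):
            result = recursive_lookup(key, v)

            # Compare against None so falsy values (0, '', []) still count as found.
            if result is not None:
                return k, result

    return None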


print(recursive_lookup('label', data))

Output:

('item2', 'label2')

Here’s a version that’s a little messier (I’m not crazy about an inner function, but at least the accumulator list isn’t a parameter and isn’t global). It returns a list of all matches, nested up to the recursion limit, except for a match in the outermost dictionary itself:

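def recursive_lookup(key, d):
    def _lookup(key, d):
        if key in d:
            return d[key]

        for k, v in d.items():
            if isinstance(v, dict):
                result = _lookup(key, v)

                # Record (enclosing key, value); "is not None" keeps falsy values.
                if result is not None:
                    accumulator.append((k, result))

    accumulator = []  # closed over by _lookup, so neither a parameter nor a global
    _lookup(key, d)
    return accumulator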

Output (on Python 3.7+, where dicts preserve insertion order):

[('item2', 'label2'), ('item3', 'label3')]

This can easily be modified to output a dict: replace accumulator = [] with accumulator = {} and accumulator.append((k, result)) with accumulator[k] = result. That might be awkward to work with, though, and a dict can’t store duplicate keys.
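
As a rough sketch of that dict-returning variant: the name recursive_lookup_dict is made up for illustration, and the check uses "is not None" (an assumption, so that falsy values such as 0 or '' are still recorded).

```python
def recursive_lookup_dict(key, d):
    """Return {enclosing_key: value} for every nested dict containing `key`."""

    def _lookup(key, d):
        if key in d:
            return d[key]

        for k, v in d.items():
            if isinstance(v, dict):
                result = _lookup(key, v)
                if result is not None:
                    # A later match under the same enclosing key overwrites
                    # an earlier one, since dict keys are unique.
                    accumulator[k] = result
        return None

    accumulator = {}
    _lookup(key, d)
    return accumulator


data = {
    'item1': {'name': 'dummy', 'type': 'dummy1'},
    'item2': {'name': 'dummy', 'type': 'dummy1', 'label': 'label2'},
    'item3': {'name': 'dummy', 'type': 'dummy1', 'label': 'label3'},
}
print(recursive_lookup_dict('label', data))
# {'item2': 'label2', 'item3': 'label3'}
```

With deeper nesting, each match is recorded under its immediate parent key, so {'a': {'b': {'label': 'x'}}} gives {'b': 'x'}.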

As for your final question, the reason you’re getting None is that the loop returns the result of the first recursive call whether or not it found anything. The first nested dict visited (item1 here) has no label key, so that call returns None immediately and the dicts that do contain label are never examined.
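
A minimal fix for the function from the question, assuming you want the first match and that None is never a legitimate stored value, is to return from the loop only when the recursive call actually found something:

```python
def recursive_lookup(k, d):
    """Return the first value stored under key `k`, or None if absent."""
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            result = recursive_lookup(k, v)
            # Keep looping over the remaining values unless this branch matched.
            if result is not None:
                return result
    return None


data = {
    'item1': {'name': 'dummy', 'type': 'dummy1'},
    'item2': {'name': 'dummy', 'type': 'dummy1', 'label': 'label2'},
}
print(recursive_lookup('label', data))  # label2
```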

Answered By: ggorlen

This function returns the path as well as the value, as a list of tuples.

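def dict_key_lookup(_dict, key, path=None):
    # Use None instead of a mutable default list for the path.
    path = [] if path is None else path
    results = []
    if isinstance(_dict, dict):
        if key in _dict:
            # Match at this level: record the full path and the value.
            results.append((path + [key], _dict[key]))
        else:
            for k, v in _dict.items():
                results.extend(dict_key_lookup(v, key, path=path + [k]))
    elif isinstance(_dict, list):
        # Lists contribute their index to the path.
        for index, item in enumerate(_dict):
            results.extend(dict_key_lookup(item, key, path=path + [index]))
    return results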

Hope this helps.
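
For illustration, here is how the path-based result might be turned into the {outer_key: value} dict the question asks for. The function is repeated so the snippet runs on its own. Taking path[-2] as the "outer key" is an assumption about what the asker wants; for a match inside a list, that element is the list index.

```python
def dict_key_lookup(_dict, key, path=None):
    """Return a list of (path, value) tuples for every match of `key`."""
    path = [] if path is None else path
    results = []
    if isinstance(_dict, dict):
        if key in _dict:
            results.append((path + [key], _dict[key]))
        else:
            for k, v in _dict.items():
                results.extend(dict_key_lookup(v, key, path=path + [k]))
    elif isinstance(_dict, list):
        for index, item in enumerate(_dict):
            results.extend(dict_key_lookup(item, key, path=path + [index]))
    return results


data = {
    'item1': {'name': 'dummy', 'type': 'dummy1'},
    'item2': {'name': 'dummy', 'type': 'dummy1', 'label': 'label2'},
    'item3': {'name': 'dummy', 'type': 'dummy1', 'label': 'label3'},
}

matches = dict_key_lookup(data, 'label')
print(matches)
# [(['item2', 'label'], 'label2'), (['item3', 'label'], 'label3')]

# path[-1] is the searched key, so path[-2] is the key of the enclosing dict.
outer = {p[-2]: value for p, value in matches if len(p) >= 2}
print(outer)  # {'item2': 'label2', 'item3': 'label3'}
```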

Answered By: Pradeep Pathak

You can use a NestedDict from the ndicts package:

from ndicts import NestedDict

data = {'item1': {'name': 'dummy', 'type': 'dummy1'},
        'item2': {'label': 'label2', 'name': 'dummy', 'type': 'dummy1'},
        'item3': {'label': 'label3', 'name': 'dummy', 'type': 'dummy1'},
        'item4': {'name': 'dummy', 'type': 'dummy1'}}
nd = NestedDict(data)

nd_filtered = NestedDict()
for key, value in nd.items():
    if "label" in key:
        new_key = tuple(level for level in key if level != "label")
        nd_filtered[new_key] = value
>>> nd_filtered
NestedDict({'item2': 'label2', 'item3': 'label3'})
>>> nd_filtered.to_dict()
{'item2': 'label2', 'item3': 'label3'}

You could also consider .extract, even though it will not give you exactly the output you asked for:

>>> nd.extract["", "label"]
NestedDict({'item2': {'label': 'label2'}, 'item3': {'label': 'label3'}})

To install ndicts, run pip install ndicts.

Answered By: edd313