Is there a recursive version of the dict.get() built-in?

Question:

I have a nested dictionary object and I want to be able to retrieve values of keys with an arbitrary depth. I’m able to do this by subclassing dict:

>>> class MyDict(dict):
...     def recursive_get(self, *args, **kwargs):
...         default = kwargs.get('default')
...         cursor = self
...         for a in args:
...             if cursor is default: break
...             cursor = cursor.get(a, default)
...         return cursor
... 
>>> d = MyDict(foo={'bar': 'baz'})
>>> d
{'foo': {'bar': 'baz'}}
>>> d.get('foo')
{'bar': 'baz'}
>>> d.recursive_get('foo')
{'bar': 'baz'}
>>> d.recursive_get('foo', 'bar')
'baz'
>>> d.recursive_get('bogus key', default='nonexistent key')
'nonexistent key'

However, I don’t want to have to subclass dict to get this behavior. Is there some built-in method that has equivalent or similar behavior? If not, are there any standard or external modules that provide this behavior?

I’m using Python 2.7 at the moment, though I would be curious to hear about 3.x solutions as well.

Asked By: jayhendren

Answers:

I am not aware of one. However, you don’t need to subclass dict at all; you can just write a function that takes a dictionary, args and kwargs and does the same thing:

def recursive_get(d, *args, **kwargs):
    default = kwargs.get('default')
    cursor = d
    for a in args:
        # Stop descending once a lookup has already fallen back to the default
        if cursor is default: break
        cursor = cursor.get(a, default)
    return cursor

Use it like this:

recursive_get(d, 'foo', 'bar')
Answered By: nicebyte

collections.defaultdict will at least handle providing default values for nonexistent keys.

Answered By: talwai

A very common pattern to do this is to use an empty dict as your default:

d.get('foo', {}).get('bar')

If you have more than a couple of keys, you could use reduce to apply the operation repeatedly (note that in Python 3 reduce must be imported: from functools import reduce):

reduce(lambda c, k: c.get(k, {}), ['foo', 'bar'], d)

Of course, you should consider wrapping this into a function (or a method):

def recursive_get(d, *keys):
    return reduce(lambda c, k: c.get(k, {}), keys, d)
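
As a quick check of how this helper behaves, here is a minimal sketch; the sample dictionary is made up for illustration. One consequence of using {} as the fallback is that a missing path comes back as an empty dict rather than None:

```python
from functools import reduce  # a builtin on Python 2, an import on Python 3

def recursive_get(d, *keys):
    # Each step falls back to {} so the next .get() call always has a dict
    return reduce(lambda c, k: c.get(k, {}), keys, d)

d = {'foo': {'bar': 'baz'}}  # illustrative sample data
found = recursive_get(d, 'foo', 'bar')
missing = recursive_get(d, 'bogus', 'key')
print(found)    # baz
print(missing)  # {}
```

If callers need to tell "missing" apart from "present but empty", the {} fallback is not enough on its own.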
Answered By: Thomas Orozco

You can actually achieve this really neatly in Python 3, given its support for keyword-only arguments and extended iterable unpacking:

In [1]: def recursive_get(d, *args, default=None):
   ...:     if not args:
   ...:         return d
   ...:     key, *args = args
   ...:     return recursive_get(d.get(key, default), *args, default=default)
   ...: 

Similar code will also work in Python 2, but you’d need to revert to using **kwargs, as you did in your example. You’d also need to use indexing to decompose *args.
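
A minimal sketch of what that Python 2-compatible variant might look like. The name recursive_get_compat is made up here to keep it apart from the Python 3 version, and it should run unchanged on Python 3 as well:

```python
def recursive_get_compat(d, *args, **kwargs):
    # Python 2 has no keyword-only arguments, so 'default' is read from kwargs
    default = kwargs.get('default')
    if not args:
        return d
    # Indexing and slicing stand in for the Python 3 'key, *args = args'
    key, rest = args[0], args[1:]
    return recursive_get_compat(d.get(key, default), *rest, default=default)

d = {'foo': {'bar': 'baz'}}
print(recursive_get_compat(d, 'foo', 'bar'))
print(recursive_get_compat(d, 'bogus key', default='nonexistent key'))
```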

In any case, there’s no need for a loop if you’re going to make the function recursive anyway.

You can see that the above code demonstrates the same functionality as your existing method:

In [2]: d = {'foo': {'bar': 'baz'}}

In [3]: recursive_get(d, 'foo')
Out[3]: {'bar': 'baz'}

In [4]: recursive_get(d, 'foo', 'bar')
Out[4]: 'baz'

In [5]: recursive_get(d, 'bogus key', default='nonexistent key')
Out[5]: 'nonexistent key'
Answered By: sapi

You can use a defaultdict to give you an empty dict on missing keys:

from collections import defaultdict
mydict = defaultdict(dict)

This only goes one level deep: mydict[missingkey] is an empty dict, but mydict[missingkey][missingkey] is a KeyError. You can add as many levels as needed by wrapping it in more defaultdicts, e.g. defaultdict(lambda: defaultdict(dict)). The factory argument must be a callable, hence the lambda. You could also have the innermost one as another defaultdict with a sensible factory function for your use case, e.g.

mydict = defaultdict(lambda: defaultdict(lambda: 'big summer blowout'))

If you need it to go to arbitrary depth, you can do that like so:

def insanity():
    return defaultdict(insanity)

print(insanity()[0][0][0][0])
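
One thing to keep in mind with this approach is that indexing a missing key on a defaultdict inserts that key, so a read-only lookup can still grow the structure. The sketch below illustrates this; the variable names are arbitrary. Calling .get() instead of indexing does not trigger the factory:

```python
from collections import defaultdict

def insanity():
    return defaultdict(insanity)

tree = insanity()
value = tree['a']['b']         # neither key existed before this lookup
print('a' in tree)             # True: the lookup created it
print('b' in tree['a'])        # True
print(tree.get('zzz'))         # None: .get() bypasses the factory
print('zzz' in tree)           # False
```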
Answered By: lvc

@ThomasOrozco’s solution is correct, but resorts to a lambda function, which is only necessary to avoid TypeError if an intermediary key does not exist. If this isn’t a concern, you can use dict.get directly:

from functools import reduce

def get_from_dict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(dict.get, mapList, dataDict)

Here’s a demo:

a = {'Alice': {'Car': {'Color': 'Blue'}}}  
path = ['Alice', 'Car', 'Color']
get_from_dict(a, path)  # 'Blue'

If you wish to be more explicit than using lambda while still avoiding TypeError, you can wrap in a try / except clause:

def get_from_dict(dataDict, mapList):
    """Iterate nested dictionary"""
    try:
        return reduce(dict.get, mapList, dataDict)
    except TypeError:
        return None  # or some other default value

Finally, if you wish to raise KeyError when a key does not exist at any level, use operator.getitem or dict.__getitem__:

from functools import reduce
from operator import getitem

def getitem_from_dict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(getitem, mapList, dataDict)
    # or reduce(dict.__getitem__, mapList, dataDict)

Note that [] is syntactic sugar for the __getitem__ method, so this mirrors precisely how you would ordinarily access a dictionary value. The operator module just provides a more readable means of accessing this method.
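
Because operator.getitem simply applies [] to whatever object it is given, the same reduce call can also walk through lists nested inside the dictionary. A small sketch with made-up sample data (the dict.__getitem__ variant would not work here, since it only accepts dict instances):

```python
from functools import reduce
from operator import getitem

# Illustrative data: a dict containing a list of dicts
data = {'users': [{'name': 'Alice'}, {'name': 'Bob'}]}

# String keys index the dicts, the integer indexes the list
second_name = reduce(getitem, ['users', 1, 'name'], data)
print(second_name)  # Bob
```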

Answered By: jpp

The Iterative Solution

def deep_get(d:dict, keys, default=None, create=True):
    if not keys:
        return default
    
    for key in keys[:-1]:
        if key in d:
            d = d[key]
        elif create:
            d[key] = {}
            d = d[key]
        else:
            return default
    
    key = keys[-1]
    
    if key in d:
        return d[key]
    elif create:
        d[key] = default
    
    return default


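def deep_set(d: dict, keys, value, create=True):
    assert keys, 'keys must not be empty'

    for key in keys[:-1]:
        if key in d:
            d = d[key]
        elif create:
            d[key] = {}
            d = d[key]
        else:
            # Without create, a missing intermediate key would otherwise
            # cause the value to be written at the wrong depth
            raise KeyError(key)

    d[keys[-1]] = value
    return value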

I am about to test it inside of a Django project with a line such as:

keys = ('options', 'style', 'body', 'name')

val = deep_set(d, keys, deep_get(s, keys, 'dotted'))

The OP requested the following behavior:

>>> d.recursive_get('bogus key', default='nonexistent key')
'nonexistent key'

(As of June 15, 2022) none of the up-voted answers accomplish this, so I have modified @ThomasOrozco’s solution to resolve this:

from functools import reduce

def rget(d, *keys, default=None):
    """Use a sentinel to handle both missing keys AND alternate default values"""
    sentinel = {}
    v = reduce(lambda c, k: c.get(k, sentinel), keys, d)
    if v is sentinel:
        return default
    return v

Below is a complete, unit-test-like demonstration of where the other answers have issues. I’ve named each approach according to its author. Note that this answer is the only one which passes all 4 test cases, namely:

  1. Basic retrieval when key-tree exists
  2. Non-existent key-tree returns None
  3. Option to specify a default aside from None
  4. Values which are an empty dict should return as themselves rather than the default

from functools import reduce


def thomas_orozco(d, *keys):
    return reduce(lambda c, k: c.get(k, {}), keys, d)


def jpp(dataDict, *mapList):
    """Same logic as thomas_orozco but exits at the first missing key instead of last"""
    try:
        return reduce(dict.get, *mapList, dataDict)
    except TypeError:
        return None


def sapi(d, *args, default=None):
    if not args:
        return d
    key, *args = args
    return sapi(d.get(key, default), *args, default=default)


def rget(d, *keys, default=None):
    sentinel = {}
    v = reduce(lambda c, k: c.get(k, sentinel), keys, d)
    if v is sentinel:
        return default
    return v


def assert_rget_behavior(func):
    """Unit tests for desired behavior of recursive dict.get()"""
    fail_count = 0

    # Basic retrieval when key-tree exists
    d = {'foo': {'bar': 'baz', 'empty': {}}}
    try:
        v = func(d, 'foo', 'bar')
        assert v == 'baz', f'Unexpected value {v} retrieved'
    except Exception as e:
        print(f'Case 1: Failed basic retrieval with {repr(e)}')
        fail_count += 1

    # Non-existent key-tree returns None
    try:
        v = func(d, 'bogus', 'key')
        assert v is None, f'Missing key retrieved as {v} instead of None'
    except Exception as e:
        print(f'Case 2: Failed missing retrieval with {repr(e)}')
        fail_count += 1

    # Option to specify a default aside from None
    default = 'alternate'
    try:
        v = func(d, 'bogus', 'key', default=default)
        assert v == default, f'Missing key retrieved as {v} instead of {default}'
    except Exception as e:
        print(f'Case 3: Failed default retrieval with {repr(e)}')
        fail_count += 1

    # Values which are an empty dict should return as themselves rather than the default
    try:
        v = func(d, 'foo', 'empty')
        assert v == {}, f'Empty dict value retrieved as {v} instead of {{}}'
    except Exception as e:
        print(f'Case 4: Failed retrieval of empty dict value with {repr(e)}')
        fail_count += 1

    # Success only if all pass
    if fail_count == 0:
        print('Passed all tests!')


if __name__ == '__main__':

    assert_rget_behavior(thomas_orozco)  # Fails cases 2 and 3
    assert_rget_behavior(jpp)  # Fails cases 1, 3, and 4
    assert_rget_behavior(sapi)  # Fails cases 2 and 3

    assert_rget_behavior(rget)  # Only one to pass all 4
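
One case the four tests above do not exercise is a path that runs through a value which is not a dict, for example asking for 'foo', 'bar', 'x' when 'bar' maps to a string. There the reduce-based rget would call .get on a str and raise AttributeError. Below is a minimal sketch of a loop-based variant that returns the default instead. The name rget_safe and the isinstance check are assumptions made for illustration, not part of the answer above:

```python
def rget_safe(d, *keys, default=None):
    """Hypothetical variant of rget that tolerates non-dict intermediates."""
    cursor = d
    for key in keys:
        # Give up as soon as the path leaves a dict or the key is absent
        if not isinstance(cursor, dict) or key not in cursor:
            return default
        cursor = cursor[key]
    return cursor

d = {'foo': {'bar': 'baz', 'empty': {}}}
print(rget_safe(d, 'foo', 'bar'))                         # baz
print(rget_safe(d, 'bogus', 'key'))                       # None
print(rget_safe(d, 'bogus', 'key', default='alternate'))  # alternate
print(rget_safe(d, 'foo', 'empty'))                       # {}
print(rget_safe(d, 'foo', 'bar', 'x', default='n/a'))     # n/a
```

Checking membership with "key in cursor" rather than comparing against a sentinel also means a stored value of None is returned as-is rather than being replaced by the default.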
Answered By: Addison Klinke