Multi-level defaultdict with variable depth?

Question:

I have a large list like:

[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4

I want to build a multi-level dict like:

A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4

I know that with a recursive defaultdict I can write table[A][B1][C1]=1 and table[A][B2]=3, but this works only if I hardcode those insert statements.

While parsing the list, I don’t know beforehand how many []’s I need in order to call table[key1][key2][...].

Asked By: Wei Shi

Answers:

Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:

from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value

root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)

Outputs:

1
3
None

You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
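As a rough sketch of the JSON route: the input string below is hypothetical and just mirrors the question's example. json.loads reads from a string, while json.load reads from a file object.

```python
import json

# Hypothetical JSON text with the same nesting as the question's example
text = '{"A": {"B1": {"C1": 1, "C2": 2}, "B2": 3}, "D": {"E": {"F": {"G": 4}}}}'

table = json.loads(text)   # plain nested dicts, nested as deep as the input

print(table["A"]["B1"]["C2"])      # 2
print(table["D"]["E"]["F"]["G"])   # 4
```

The result is made of ordinary dicts, so looking up a missing key raises KeyError instead of creating a new level.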

Answered By: Apalala

I’d do it with a subclass of dict that defines __missing__:

>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}

You can’t do it directly with defaultdict because defaultdict expects the factory function at initialization time, but at initialization time there’s no way to describe the same defaultdict. The above construct does the same thing that defaultdict does, but since it’s a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
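A minimal sketch of that last idea, subclassing defaultdict and overriding __init__. The class name NestedDefaultDict is made up for illustration, and methods that re-invoke the constructor (such as copy()) may need extra care that is not shown here.

```python
from collections import defaultdict

class NestedDefaultDict(defaultdict):
    """A defaultdict whose missing keys become NestedDefaultDict instances."""

    def __init__(self):
        # Use the class itself as the default factory, so every missing
        # key is filled with a fresh instance of the same class.
        super().__init__(NestedDefaultDict)

table = NestedDefaultDict()
table['A']['B1']['C1'] = 1
table['A']['B2'] = 3

print(table['A']['B1']['C1'])   # 1
```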

Answered By: Jason R. Coombs

You can do it without even defining a class:

from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()

nest[0][1][2][3][4][5] = 6
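
For the variable-depth case in the question, one option is to walk the key list in a loop instead of writing out the brackets. This is only a sketch: parse_line and set_path are hypothetical helper names, and it assumes each input line is literal text of the form [A][B1][C1]=1 with an integer value.

```python
import re
from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)

def parse_line(line):
    """Split '[A][B1][C1]=1' into (['A', 'B1', 'C1'], 1). Assumed input format."""
    path, _, value = line.partition("=")
    keys = re.findall(r"\[([^\]]+)\]", path)
    return keys, int(value)

def set_path(table, keys, value):
    """Descend through all keys but the last, then assign at the final key."""
    node = table
    for key in keys[:-1]:
        node = node[key]          # missing levels are created on access
    node[keys[-1]] = value

lines = ["[A][B1][C1]=1", "[A][B1][C2]=2", "[A][B2]=3", "[D][E][F][G]=4"]

table = nested_dict()
for line in lines:
    keys, value = parse_line(line)
    set_path(table, keys, value)

print(table["A"]["B1"]["C2"])      # 2
print(table["D"]["E"]["F"]["G"])   # 4
```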
Answered By: Hugo Walter

I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.

# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)

Usage:

# Create instance
mydict = tree()

mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0

# Print (the standard json module can serialize a defaultdict)
import json
print(json.dumps(mydict, indent=2))

Output:

{
  "a": 1,
  "b": {
    "a": 2
  },
  "c": {},
  "d": {
    "a": {
      "b": 0
    }
  }
}
Answered By: Bouke Versteegh

This is equivalent to the above, but avoids the lambda notation. Perhaps easier to read?

from collections import defaultdict

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()

Also, from the comments: if you’d like to add the values of an existing dict, you can simply call update at whatever depth you need:

your_dict[0][1][2].update({"some_key":"some_value"})
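
A short sketch of what that update call does. One caveat worth knowing: values merged in this way are stored as-is, so a plain dict nested inside the argument stays a plain dict and will not create missing keys on its own.

```python
from collections import defaultdict

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()
your_dict[0][1][2].update({"some_key": "some_value"})

# Levels 0, 1 and 2 were created by the lookups themselves
print(your_dict[0][1][2]["some_key"])    # some_value

# A nested plain dict passed to update() is not converted
your_dict["x"].update({"inner": {}})
print(type(your_dict["x"]["inner"]))     # <class 'dict'>
```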

Answered By: gabe

A slightly different possibility that allows regular dictionary initialization:

from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

Example:

>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
Answered By: Vincent

Dan O’Huiginn posted a very nice solution on his journal in 2010:

http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html

>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())


>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
Answered By: Dvd Avins

To add to @Hugo's answer, here is a variant with a maximum depth:

from collections import defaultdict

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)
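
To illustrate where the limit kicks in (a small sketch; the keys are arbitrary): with l(2), the first three keys are created automatically, the third one maps to a plain dict, and anything deeper than that raises KeyError.

```python
from collections import defaultdict

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)

arr["a"]["b"]["c"]["d"] = 1    # fine: "c" maps to a plain dict, which stores "d"

try:
    arr["a"]["b"]["x"]["y"]["z"] = 2   # one level too deep
    too_deep = False
except KeyError:
    too_deep = True            # the plain dict at "x" has no key "y"

print(too_deep)   # True
```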
Answered By: firecraker180

You may achieve this with a recursive defaultdict.

from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

It is important to protect the default factory name, the_tree here, in a closure (“private” local function scope). Avoid the one-liner lambda version, which is fragile because Python closures bind names late: the lambda looks up its own name every time it is called. Implement this with a def instead.

The accepted answer, using a lambda, has a flaw: instances rely on the name nested_dict still existing in an outer scope. If for whatever reason the factory name cannot be resolved (e.g. it was rebound or deleted), then pre-existing instances also become subtly broken:

>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
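
For comparison, here is the same experiment with the closure-based version from the top of this answer. The factory name lives inside the function's own scope, so deleting the outer name does not affect existing instances.

```python
from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1][2] = 3

del tree             # remove the public name
nest[8][9] = 10      # still works: the_tree is held by the closure

print(nest[8][9])    # 10
```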
Answered By: wim

You could use a NestedDict.

from ndicts.ndicts import NestedDict

nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6

The result as a dictionary:

>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}

To install ndicts:

pip install ndicts
Answered By: edd313