Merge several Python dictionaries

Question:

I have to merge a list of Python dictionaries. For example, these inputs should produce the super_dict shown below:

dicts[0] = {'a':1, 'b':2, 'c':3}
dicts[1] = {'a':1, 'd':2, 'c':'foo'}
dicts[2] = {'e':57,'c':3}

super_dict = {'a':[1], 'b':[2], 'c':[3,'foo'], 'd':[2], 'e':[57]}    

I wrote the following code:

super_dict = {}
for d in dicts:
    for k, v in d.items():
        if super_dict.get(k) is None:
            super_dict[k] = []
        if v not in super_dict.get(k):
            super_dict[k].append(v)

Can it be presented more elegantly / optimized?

Note
I found another question on SO, but it's about merging exactly 2 dictionaries.

Asked By: jerrymouse

Answers:

This may be a bit more elegant:

super_dict = {}
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        l=super_dict.setdefault(k,[])
        if v not in l:
            l.append(v)

UPDATE: made change suggested by Sven

UPDATE: changed to avoid duplicates (thanks Marcin and Steven)

Answered By: Vaughn Cato

Merge the keys of all dicts, and for each key assemble the list of values:

super_dict = {}
for k in set(k for d in dicts for k in d):
    super_dict[k] = [d[k] for d in dicts if k in d]

The expression set(k for d in dicts for k in d) builds a set of all unique keys of all dictionaries. For each of these unique keys, we use the list comprehension [d[k] for d in dicts if k in d] to build the list of values from all dicts for this key.

Since you only seem to want the unique values for each key, you might want to use sets instead:

super_dict = {}
for k in set(k for d in dicts for k in d):
    super_dict[k] = set(d[k] for d in dicts if k in d)
Answered By: Sven Marnach

You can iterate over the dictionaries directly — no need to use range. The setdefault method of dict looks up a key, and returns the value if found. If not found, it returns a default, and also assigns that default to the key.

super_dict = {}
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict.setdefault(k, []).append(v)
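
To make the setdefault behaviour described above concrete, here is a minimal sketch in Python 3. The variable names values and same_values are only illustrative.

```python
# Minimal demonstration of dict.setdefault (Python 3).
super_dict = {}

# Key missing: setdefault stores the default and returns that same list.
values = super_dict.setdefault('a', [])
values.append(1)

# Key present: the stored list is returned and the new default is ignored.
same_values = super_dict.setdefault('a', ['ignored'])
same_values.append(2)

print(super_dict)  # {'a': [1, 2]}
```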

Also, you might consider using a defaultdict. This just automates setdefault by calling a function to return a default value when a key isn’t found.

import collections
super_dict = collections.defaultdict(list)
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict[k].append(v)
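
As a rough, self-contained Python 3 sketch of the defaultdict(list) version, run on the sample data from the question. Converting to a plain dict at the end is an optional extra step, not part of the answer above.

```python
from collections import defaultdict

dicts = [{'a': 1, 'b': 2, 'c': 3},
         {'a': 1, 'd': 2, 'c': 'foo'},
         {'e': 57, 'c': 3}]

super_dict = defaultdict(list)
for d in dicts:
    for k, v in d.items():
        super_dict[k].append(v)  # a missing key is created as [] on first access

# A plain lookup of an unknown key also inserts it into a defaultdict, so
# converting to a regular dict after merging avoids that side effect.
result = dict(super_dict)
print(result)
# {'a': [1, 1], 'b': [2], 'c': [3, 'foo', 3], 'd': [2], 'e': [57]}
```

Repeated values are kept here, since nothing filters them out.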

Also, as Sven Marnach astutely observed, you seem to want no duplication of values in your lists. In that case, set gets you what you want:

import collections
super_dict = collections.defaultdict(set)
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict[k].add(v)
Answered By: senderle

Never forget that the standard libraries have a wealth of tools for dealing with dicts and iteration:

from itertools import chain
from collections import defaultdict
super_dict = defaultdict(list)
for k, v in chain.from_iterable(d.iteritems() for d in dicts):  # d.items() in Python 3+
    if v not in super_dict[k]:
        super_dict[k].append(v)

Note that the if v not in super_dict[k] can be avoided by using defaultdict(set) as per Steven Rumbalski’s answer.

Answered By: Marcin

from collections import defaultdict

dicts = [{'a':1, 'b':2, 'c':3},
         {'a':1, 'd':2, 'c':'foo'},
         {'e':57, 'c':3} ]

super_dict = defaultdict(set)  # uses set to avoid duplicates

for d in dicts:
    for k, v in d.items():  # use d.iteritems() in python 2
        super_dict[k].add(v)
Answered By: Steven Rumbalski

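I’m a bit late to the game, but I did it in 2 lines with no dependencies beyond Python itself (Python 2 only: reduce is a builtin there, and dict.keys() returns lists that can be concatenated with +):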

flatten = lambda *c: (b for a in c for b in (flatten(*a) if isinstance(a, (tuple, list)) else (a,)))
o = reduce(lambda d1,d2: dict((k, list(flatten([d1.get(k), d2.get(k)]))) for k in set(d1.keys() + d2.keys())), dicts)
# output:
# {'a': [1, 1, None], 'c': [3, 'foo', 3], 'b': [2, None, None], 'e': [None, 57], 'd': [None, 2, None]}

Though if you don’t care about nested lists, then:

o2 = reduce(lambda d1,d2: dict((k, [d1.get(k), d2.get(k)]) for k in set(d1.keys() + d2.keys())), dicts)
# output:
# {'a': [[1, 1], None], 'c': [[3, 'foo'], 3], 'b': [[2, None], None], 'e': [None, 57], 'd': [[None, 2], None]}
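
The two snippets above depend on Python 2 (builtin reduce, list-returning keys()). As a rough Python 3 sketch of the same reduce-based idea, assuming it is acceptable to skip missing keys rather than pad with None. The name merge_two is a hypothetical helper, not part of the original answer.

```python
from functools import reduce

dicts = [{'a': 1, 'b': 2, 'c': 3},
         {'a': 1, 'd': 2, 'c': 'foo'},
         {'e': 57, 'c': 3}]

def merge_two(acc, d):
    """Hypothetical helper: fold one dict into an accumulator of lists."""
    for k, v in d.items():
        acc.setdefault(k, []).append(v)
    return acc

# The empty dict is the initial accumulator, so the inputs are not modified.
o = reduce(merge_two, dicts, {})
print(o)
# {'a': [1, 1], 'b': [2], 'c': [3, 'foo', 3], 'd': [2], 'e': [57]}
```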
Answered By: platinummonkey

For a one-liner, the following could be used:

{key: {d[key] for d in dicts if key in d} for key in {key for d in dicts for key in d}}

although readability would benefit from naming the combined key set:

combined_key_set = {key for d in dicts for key in d}
super_dict = {key: {d[key] for d in dicts if key in d} for key in combined_key_set}

Elegance can be debated but personally I prefer comprehensions over for loops. 🙂

(The dictionary and set comprehensions are available in Python 2.7/3.1 and newer.)
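
If lists in first-seen order are preferred over sets, one possible variant is sketched below. It assumes Python 3.7+ (where dicts preserve insertion order) and hashable values, and it uses dict.fromkeys to drop repeats. This is an adaptation, not the code from the answer above.

```python
dicts = [{'a': 1, 'b': 2, 'c': 3},
         {'a': 1, 'd': 2, 'c': 'foo'},
         {'e': 57, 'c': 3}]

# dict.fromkeys keeps first-seen order and silently drops repeated keys.
combined_keys = dict.fromkeys(key for d in dicts for key in d)

super_dict = {key: list(dict.fromkeys(d[key] for d in dicts if key in d))
              for key in combined_keys}
print(super_dict)
# {'a': [1], 'b': [2], 'c': [3, 'foo'], 'd': [2], 'e': [57]}
```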

Answered By: 7mp

It seems like most of the answers using comprehensions are not all that readable. In case anyone gets lost in the mess of answers above, this might be helpful (although extremely late…). Just loop over the items of each dict and place them in a separate one.

super_dict = {key:val for d in dicts for key,val in d.items()}
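
Note that this flat comprehension keeps a single value per key (the last one seen) rather than collecting values into lists. A quick check on the question's sample data, as a sketch:

```python
dicts = [{'a': 1, 'b': 2, 'c': 3},
         {'a': 1, 'd': 2, 'c': 'foo'},
         {'e': 57, 'c': 3}]

# Later dicts overwrite earlier ones: 'c' goes 3 -> 'foo' -> 3.
super_dict = {key: val for d in dicts for key, val in d.items()}
print(super_dict)
# {'a': 1, 'b': 2, 'c': 3, 'd': 2, 'e': 57}
```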
Answered By: pbreach

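My solution is similar to the one @senderle proposed, but it uses map instead of for loops:

from collections import defaultdict
super_dict = defaultdict(set)
# Python 2 only: in Python 3 map() is lazy, so nothing runs until both maps are consumed
map(lambda y: map(lambda x: super_dict[x].add(y[x]), y), dicts)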
Answered By: MosheZada

If you assume that the keys in which you are interested are at the same nested level, you can recursively traverse each dictionary and create a new dictionary using that key, effectively merging them.

def walk(d, merge):
    for key, item in d.items():
        if isinstance(item, dict):
            # descend into (or create) the nested dict for this key
            walk(item, merge.setdefault(key, {}))
        else:
            # leaf value: collect it in a list under this key
            merge.setdefault(key, []).append(item)

merged = {}
for d in dicts:
    walk(d, merged)

For example, say you have the following dictionaries you want to merge.

dicts = [{'A': {'A1': {'FOO': [1,2,3]}}},
         {'A': {'A1': {'A2': {'BOO': [4,5,6]}}}},
         {'A': {'A1': {'FOO': [7,8]}}},
         {'B': {'B1': {'COO': [9]}}},
         {'B': {'B2': {'DOO': [10,11,12]}}},
         {'C': {'C1': {'C2': {'POO':[13,14,15]}}}},
         {'C': {'C1': {'ROO': [16,17]}}}]

Using the key at each level, you should get something like this:

{'A': {'A1': {'FOO': [[1, 2, 3], [7, 8]], 
              'A2': {'BOO': [[4, 5, 6]]}}},
 'B': {'B1': {'COO': [[9]]}, 
       'B2': {'DOO': [[10, 11, 12]]}},
 'C': {'C1': {'C2': {'POO': [[13, 14, 15]]}, 
              'ROO': [[16, 17]]}}}

Note: I assume the leaf at each branch is a list of some kind, but you can obviously change the logic to do whatever is necessary for your situation.

Answered By: davini

Using defaultdict is good; this can also be done with itertools.groupby.

import itertools
# output all dict items, and sort them by key
dicts_ele = sorted( ( item for d in dicts for item in d.items() ), key = lambda x: x[0] )
# groups items by key
ele_groups = itertools.groupby( dicts_ele, key = lambda x: x[0] )
# iterates over groups and get item value
merged = { k: set( v[1] for v in grouped ) for k, grouped in ele_groups }

You can also merge this block of code into a single expression:

merged = {
    k: set( v[1] for v in grouped )
    for k, grouped in (
        itertools.groupby(
            sorted(
                ( item for d in dicts for item in d.items() ),
                key = lambda x: x[0]
            ),
            key = lambda x: x[0]
        )
    )
}
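
The sort matters because itertools.groupby only groups consecutive items with equal keys. A small sketch of the difference, using a made-up list of pairs:

```python
import itertools

pairs = [('a', 1), ('b', 2), ('a', 3)]  # illustrative data, not from the question

def first(pair):
    return pair[0]

# Without sorting, the two 'a' items are not adjacent, so 'a' appears twice.
unsorted_groups = [(k, [v for _, v in g])
                   for k, g in itertools.groupby(pairs, key=first)]
print(unsorted_groups)  # [('a', [1]), ('b', [2]), ('a', [3])]

# After sorting by key, each key forms exactly one group.
sorted_groups = [(k, [v for _, v in g])
                 for k, g in itertools.groupby(sorted(pairs, key=first), key=first)]
print(sorted_groups)  # [('a', [1, 3]), ('b', [2])]
```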
Answered By: Sphynx-HenryAY

When the values are themselves lists:

from collections import defaultdict

dicts = [{'a':[1], 'b':[2], 'c':[3]},
         {'a':[11], 'd':[2], 'c':['foo']},
         {'e':[57], 'c':[3], "a": [1]} ]

super_dict = defaultdict(list)  # missing keys start as an empty list

for d in dicts:
    for k, v in d.items():  # use d.iteritems() in python 2
        super_dict[k] = list(set(super_dict[k] + v))  # set() drops duplicates

combined_dict = {}

for elem in super_dict.keys():
    combined_dict[elem] = super_dict[elem]

combined_dict
## output (order within each list may vary): {'a': [1, 11], 'b': [2], 'c': [3, 'foo'], 'd': [2], 'e': [57]}
Answered By: Ramkrishan Sahu

You can use dictionary unpacking (Python 3.5+), which is fairly elegant:

a = {'a':1, 'b':2, 'c':3}
b = {'d':1, 'e':2, 'f':3}
c = {1:1, 2:2, 3:3}
merge = {**a, **b, **c}
print(merge) # {'a': 1, 'b': 2, 'c': 3, 'd': 1, 'e': 2, 'f': 3, 1: 1, 2: 2, 3: 3}

and you are good to go 🙂

Answered By: G_kuldeep

I have a very simple solution without any imports: the dict.update() method.
Unfortunately it overwrites: if the same key appears in more than one dictionary, the value from the most recently merged dict appears in the output.

dict1 = {'Name': 'Zara', 'Age': 7}
dict2 = {'Sex': 'female' }
dict3 = {'Status': 'single', 'Age': 27}
dict4 = {'Occupation':'nurse', 'Wage': 3000}

def mergedict(*args):
    output = {}
    for arg in args:
        output.update(arg)
    return output
    
print(mergedict(dict1, dict2, dict3, dict4))

The output is this:

{'Name': 'Zara', 'Age': 27, 'Sex': 'female', 'Status': 'single', 'Occupation': 'nurse', 'Wage': 3000}

Answered By: blQSheep

Python 3.x (in Python 2.x, reduce is a builtin, so the functools import is not needed):

import operator
from functools import reduce

a = [{'a': 1}, {'b': 2}, {'c': 3, 'd': 4}]

dict(reduce(operator.add, map(list, map(dict.items, a))))

map(dict.items, a) # yields the key/value items view of each dict

map(list, ...) # converts each view to a list: [('a', 1)], [('b', 2)], [('c', 3), ('d', 4)]

reduce(operator.add, ...) # concatenates those lists into a single list of pairs

Answered By: ElbowPipe

Perhaps a more modern and concise approach, for those on Python 3.3 or later, is ChainMap from the collections module.

from collections import ChainMap

d1 = {'a': 1, 'b': 3}
d2 = {'c': 2}
d3 = {'d': 7, 'a': 9}
d4 = {}
combo = dict(ChainMap(d1, d2, d3, d4))
# {'d': 7, 'a': 1, 'c': 2, 'b': 3}

For a larger collection of dict objects, the star operator works:

dict(ChainMap(*dict_collection))

Note that the resulting dictionary keeps the value from the first mapping that contains a key and ignores later duplicates, because ChainMap searches its mappings in order.
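
A short sketch of that precedence, and of one way to get last-wins behaviour instead by reversing the inputs. The names first_wins and last_wins are only illustrative.

```python
from collections import ChainMap

d1 = {'a': 1, 'b': 3}
d3 = {'d': 7, 'a': 9}

# Lookups search the mappings left to right, so d1's 'a' shadows d3's.
first_wins = dict(ChainMap(d1, d3))

# Reversing the order gives precedence to later dicts instead.
last_wins = dict(ChainMap(*reversed([d1, d3])))

print(first_wins['a'])  # 1
print(last_wins['a'])   # 9
```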

Answered By: LurkerZ

This is a more recent enhancement over the prior answer by ElbowPipe, using newer syntax introduced in Python 3.9 for merging dictionaries. Note that this answer does not merge conflicting values into a list!

> import functools
> import operator

> functools.reduce(operator.or_, [{0:1}, {2:3, 4:5}, {2:6}])

{0: 1, 2: 6, 4: 5}
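
For reference, operator.or_ corresponds to the | operator, so the same merge can be sketched with the operator written out. This assumes Python 3.9 or later, and the variable names are illustrative.

```python
# Requires Python 3.9+ for the dict union operators.
d1 = {0: 1}
d2 = {2: 3, 4: 5}
d3 = {2: 6}

merged = d1 | d2 | d3  # builds a new dict; right-hand values win on conflicts
print(merged)          # {0: 1, 2: 6, 4: 5}

d1 |= d3               # in-place variant: updates d1 itself
print(d1)              # {0: 1, 2: 6}
```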
Answered By: Asclepius