Merge multiple dictionaries within a list by primary key

Question:

Following question:

I have a list of dictionaries like:

l = [{"primary_key": 5164, "attribute1": 2},
     {"primary_key": 5162, "attribute1": 3}, 
     {"primary_key": 5164, "attribute2": 3}] 

I want to join them into an output list, but merging the dictionaries that have the same primary-key as:

output_l = [{"primary_key": 5164, "attribute1": 2, "attribute2": 3},
            {"primary_key": 5162, "attribute1": 3}]

After searching a lot, I did not find this question anywhere else. Sorry if it is a duplicate. It is closely related to this question, but with only one list instead of multiple.

Asked By: KevinYanesG

||

Answers:

Create a dictionary where you store the ever growing dictionaries by their primary key. The following will work:

dicts = {}
for dct in l:
    dicts.setdefault(dct["primary_key"], {}).update(dct)

l = [*dicts.values()]
# [{'primary_key': 5164, 'attribute1': 2, 'attribute2': 3},
#  {'primary_key': 5162, 'attribute1': 3}]

Note that for a shared attribute name, the latest entry will "win".
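
To make the "latest entry wins" behavior concrete, here is a minimal sketch using hypothetical sample data (`rows`) in which two entries with the same primary key both define `attribute1`. The second loop is one possible variation, assuming you would rather keep the earliest value; it fills in only the keys that are still missing.

```python
# Hypothetical sample data: both entries for key 1 define "attribute1".
rows = [
    {"primary_key": 1, "attribute1": "old"},
    {"primary_key": 1, "attribute1": "new", "attribute2": 7},
]

# Later entries overwrite earlier ones (the behavior of dict.update).
last_wins = {}
for row in rows:
    last_wins.setdefault(row["primary_key"], {}).update(row)

# Earlier entries are kept: only keys that are still missing get filled in.
first_wins = {}
for row in rows:
    merged = first_wins.setdefault(row["primary_key"], {})
    for key, value in row.items():
        merged.setdefault(key, value)

print(list(last_wins.values()))
# [{'primary_key': 1, 'attribute1': 'new', 'attribute2': 7}]
print(list(first_wins.values()))
# [{'primary_key': 1, 'attribute1': 'old', 'attribute2': 7}]
```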

Answered By: user2390182

Here is another approach:

l = [{"primary_key": 5164, "attribute1": 2},
     {"primary_key": 5162, "attribute1": 3}, 
     {"primary_key": 5164, "attribute2": 3},
     {"primary_key": 5162, "attribute2": 3},
     {"primary_key": 5163, "attribute1": 4}] 

# Final list
out_list = []

# Dict to maintain the position where particular primary key was inserted in out_list
visited_keys = {}
index = 0
for item in l:
    # If this key is already in out_list, update it with new key:value
    if item['primary_key'] in visited_keys:
        out_list[visited_keys[item["primary_key"]]].update(item)
    else:
        # Key is not visited, so just append it to the out_list
        out_list.append(item)
        visited_keys[item["primary_key"]] = index
        index += 1
        
print(out_list)

Output:

[{'primary_key': 5164, 'attribute1': 2, 'attribute2': 3}, {'primary_key': 5162, 'attribute1': 3, 'attribute2': 3}, {'primary_key': 5163, 'attribute1': 4}]
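
One detail of this approach: out_list.append(item) stores the original dictionary object, so the later update() call also changes the first matching dictionary inside l. If the input should stay untouched, storing a shallow copy is likely enough when the values are immutable, as in this minimal sketch (the names `records`, `merged` and `position` are hypothetical):

```python
# Hypothetical sample data, not taken from the answer above.
records = [
    {"primary_key": 1, "attribute1": 2},
    {"primary_key": 1, "attribute2": 3},
]

merged = []
position = {}  # primary key -> index of its dictionary in merged
for item in records:
    key = item["primary_key"]
    if key in position:
        merged[position[key]].update(item)
    else:
        position[key] = len(merged)
        # Store a shallow copy so later updates do not touch the input.
        merged.append(item.copy())

print(merged)      # [{'primary_key': 1, 'attribute1': 2, 'attribute2': 3}]
print(records[0])  # {'primary_key': 1, 'attribute1': 2}
```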
Answered By: Bhagyesh Dudhediya

from collections import defaultdict
from operator import itemgetter


l = [{"primary_key": 64, "attribute1": 2},
     {"primary_key": 62, "attribute1": 3}, 
     {"primary_key": 64, "attribute2": 3}] 


def m1(l):
    # Merge dictionaries that share a primary key into one dictionary per key
    d = defaultdict(dict)
    for innerdict in l:
        d[innerdict['primary_key']].update(innerdict)
    # Sort the merged dictionaries by primary key
    dvsorted = sorted(d.values(), key=itemgetter('primary_key'))
    print(dvsorted)
    return dvsorted

m1(l)
        
"""
Output :
[{'primary_key': 62, 'attribute1': 3}, {'primary_key': 64, 'attribute1': 2, 'attribute2': 3}]

"""

Explanation :

  1. The function m1 takes a list of dictionaries as input.
  2. It creates a defaultdict(dict), which maps each "primary_key" value to a dictionary that starts out empty.
  3. For each innerdict, it updates the dictionary stored under that innerdict's primary key, so entries with the same key are merged.
  4. It then sorts the merged dictionaries by primary_key.
  5. Finally, it prints the sorted list.
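
The steps above can also be sketched as a reusable function that returns the merged list. The name `merge_by_key` and its `key` and `sort` parameters are hypothetical, chosen only for illustration. Sorting is optional here because dictionaries keep insertion order (Python 3.7+), so without it the result follows the order in which each primary key first appears, which matches the output requested in the question.

```python
from collections import defaultdict
from operator import itemgetter


def merge_by_key(records, key="primary_key", sort=False):
    """Merge dictionaries that share the same value under `key`.

    Hypothetical helper: the name and parameters are illustrative.
    """
    merged = defaultdict(dict)
    for record in records:
        # Merges into a fresh dictionary, so the inputs are not modified.
        merged[record[key]].update(record)
    result = list(merged.values())
    if sort:
        result.sort(key=itemgetter(key))
    return result


data = [{"primary_key": 64, "attribute1": 2},
        {"primary_key": 62, "attribute1": 3},
        {"primary_key": 64, "attribute2": 3}]

print(merge_by_key(data))
# [{'primary_key': 64, 'attribute1': 2, 'attribute2': 3}, {'primary_key': 62, 'attribute1': 3}]
print(merge_by_key(data, sort=True))
# [{'primary_key': 62, 'attribute1': 3}, {'primary_key': 64, 'attribute1': 2, 'attribute2': 3}]
```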
Answered By: Soudipta Dutta