Merge two lists of dicts based on an index in the dicts

Question:

These two datasets are related but come from separate JSON files, so I would like to merge them. They can be matched on index, but I have not found a good way of doing this 🙂

List of dicts 1:

[
  {'index': 217, 'name': 'Battery'},
  {'index': 218, 'name': 'Fluffy'},
  {'index': 219, 'name': 'Dazzling'},
  {'index': 220, 'name': 'Soul-Heart'}
]

List of dicts 2:

[
  {'index': 217, 'desc': 'Text info 2'},
  {'index': 218, 'desc': 'will be very informative'},
  {'index': 219, 'desc': 'dont know what else i could write here'},
  {'index': 220, 'desc': 'Boosts my wallet'}
]

Result should be something like:

[
  {'index': 217, 'name': 'Battery', 'desc': 'Text info 2'},
  {'index': 218, 'name': 'Fluffy', 'desc': 'will be very informative'},
  {'index': 219, 'name': 'Dazzling', 'desc': 'dont know what else i could write here'},
  {'index': 220, 'name': 'Soul-Heart', 'desc': 'Boosts my wallet'}
]

There is a lot more data, but as soon as I know how to merge, I think I can do the rest.

Asked By: TLListenreich

||

Answers:

To merge two dictionaries in Python that share a key, you can call the update() method on one of them. Where both dictionaries contain the same key, update() overwrites the value in the first dictionary with the value from the second.

dict1.update(dict2)

This gives the intended result for a single pair of dicts, with the first dictionary modified in place. For the two lists in the question, it has to be applied to each pair of dicts that share the same index.
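As a quick illustration of that overwrite behaviour, here is a minimal sketch with two single dicts based on the question's data. The conflicting 'name' value in dict2 is made up for the example and is not in the original data.

```python
# Two single dicts that share the 'index' key.
# The 'name' entry in dict2 is a made-up conflict for illustration.
dict1 = {'index': 217, 'name': 'Battery'}
dict2 = {'index': 217, 'name': 'Battery v2', 'desc': 'Text info 2'}

result = dict1.update(dict2)  # update() works in place and returns None

print(result)  # None
print(dict1)   # {'index': 217, 'name': 'Battery v2', 'desc': 'Text info 2'}
```

Since update() returns None, assigning its result to a variable is a common mistake; the merged data lives in dict1.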

Answered By: aditya singh

I’m assuming the values of the index key are unique within each list:

lst1 = [
    {"index": 217, "name": "Battery"},
    {"index": 218, "name": "Fluffy"},
    {"index": 219, "name": "Dazzling"},
    {"index": 220, "name": "Soul-Heart"},
]

lst2 = [
    {"index": 217, "desc": "Text info 2"},
    {"index": 218, "desc": "will be very informative"},
    {"index": 219, "desc": "dont know what else i could write here"},
    {"index": 220, "desc": "Boosts my wallet"},
]


tmp1 = {d["index"]: d["name"] for d in lst1}
tmp2 = {d["index"]: d["desc"] for d in lst2}

out = []
for k in sorted(tmp1.keys() & tmp2.keys()):  # set intersection is unordered, so sort
    out.append(
        {"index": k, "name": tmp1.get(k, "N/A"), "desc": tmp2.get(k, "N/A")}
    )

print(out)

Prints:

[
    {"index": 217, "name": "Battery", "desc": "Text info 2"},
    {"index": 218, "name": "Fluffy", "desc": "will be very informative"},
    {
        "index": 219,
        "name": "Dazzling",
        "desc": "dont know what else i could write here",
    },
    {"index": 220, "name": "Soul-Heart", "desc": "Boosts my wallet"},
]
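If the two lists might not contain exactly the same indexes, the key intersection above drops the unmatched records. One possible variation, sketched here under the assumption that unmatched records should be kept with an "N/A" placeholder, iterates over the union of the keys instead. The record with index 221 and the name out_all are invented for the example.

```python
lst1 = [
    {"index": 217, "name": "Battery"},
    {"index": 218, "name": "Fluffy"},
]
lst2 = [
    {"index": 218, "desc": "will be very informative"},
    {"index": 221, "desc": "no matching name"},  # hypothetical unmatched record
]

tmp1 = {d["index"]: d["name"] for d in lst1}
tmp2 = {d["index"]: d["desc"] for d in lst2}

# Union (|) instead of intersection (&); sorted() gives a stable order
out_all = [
    {"index": k, "name": tmp1.get(k, "N/A"), "desc": tmp2.get(k, "N/A")}
    for k in sorted(tmp1.keys() | tmp2.keys())
]

print(out_all)
```

Here the .get(k, "N/A") defaults actually come into play: index 217 gets "N/A" as its desc and index 221 gets "N/A" as its name.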

Answered By: Andrej Kesely

The solution at the end of this answer works, but it’s inefficient: for every dict in the first list it loops over every dict in the second list. It’d be better if you’re able to change the data structures so you’re not iterating over dicts in a list.
See if it’d be possible to change it so you have something like…

dicts2 = {
  217 : {'desc': 'Text info 2'},
  218 : {'desc': 'will be very informative'},
  219 : {'desc': 'dont know what else i could write here'},
  220 : {'desc': 'Boosts my wallet'}
}

This way you take advantage of dict as a data structure and look up the item you want directly, instead of iterating over each item as you would with a list.
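A minimal sketch of that idea, assuming the data still arrives as lists and the index values are unique: build the lookup dict once, then merge each record with a single dictionary lookup. The names by_index and merged are illustrative only.

```python
dicts1 = [
    {'index': 217, 'name': 'Battery'},
    {'index': 218, 'name': 'Fluffy'},
]
dicts2 = [
    {'index': 217, 'desc': 'Text info 2'},
    {'index': 218, 'desc': 'will be very informative'},
]

# Build the lookup once: index -> dict from the second list
by_index = {d['index']: d for d in dicts2}

# One dictionary lookup per record instead of an inner loop;
# {**a, **b} builds a new dict, so the input lists are left unchanged
merged = [{**d1, **by_index.get(d1['index'], {})} for d1 in dicts1]

print(merged)
```

A record in dicts1 with no partner in dicts2 is kept as it is, because the lookup falls back to an empty dict.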

But here’s your solution as it is now:

dicts1 = [
  {'index': 217, 'name': 'Battery'},
  {'index': 218, 'name': 'Fluffy'},
  {'index': 219, 'name': 'Dazzling'},
  {'index': 220, 'name': 'Soul-Heart'}
]

dicts2 = [
  {'index': 217, 'desc': 'Text info 2'},
  {'index': 218, 'desc': 'will be very informative'},
  {'index': 219, 'desc': 'dont know what else i could write here'},
  {'index': 220, 'desc': 'Boosts my wallet'}
]

# Compare every dict in dicts1 with every dict in dicts2
for d1 in dicts1:
    for d2 in dicts2:
        if d1['index'] == d2['index']:
            # Copy d1's keys into the matching d2 (modifies dicts2 in place)
            d2.update(d1)

print(dicts2)

Output:

[
  {'index': 217, 'desc': 'Text info 2', 'name': 'Battery'},
  {'index': 218, 'desc': 'will be very informative', 'name': 'Fluffy'},
  {'index': 219, 'desc': 'dont know what else i could write here', 'name': 'Dazzling'},
  {'index': 220, 'desc': 'Boosts my wallet', 'name': 'Soul-Heart'}
]

Answered By: coniferous

Pandas handles merges like a breeze.

First convert the data into dataframes:

import pandas as pd

data1 = [
    {'index': 217, 'name': 'Battery'},
    {'index': 218, 'name': 'Fluffy'},
    {'index': 219, 'name': 'Dazzling'},
    {'index': 220, 'name': 'Soul-Heart'},
]
data2 = [
    {'index': 217, 'desc': 'Text info 2'},
    {'index': 218, 'desc': 'will be very informative'},
    {'index': 219, 'desc': 'dont know what else i could write here'},
    {'index': 220, 'desc': 'Boosts my wallet'},
]
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

Then merge on the index column:

df_out = df1.merge(df2, on='index')

Which looks like this:

   index        name                                    desc
0    217     Battery                             Text info 2
1    218      Fluffy                will be very informative
2    219    Dazzling  dont know what else i could write here
3    220  Soul-Heart                        Boosts my wallet

Then convert back to a list of dicts:

df_out.to_dict(orient='records')
[{'index': 217, 'name': 'Battery', 'desc': 'Text info 2'},
 {'index': 218, 'name': 'Fluffy', 'desc': 'will be very informative'},
 {'index': 219, 'name': 'Dazzling', 'desc': 'dont know what else i could write here'},
 {'index': 220, 'name': 'Soul-Heart', 'desc': 'Boosts my wallet'}]

Answered By: wjandrea

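To do this merge for the lists, use this. Note that zip pairs the dicts by position, not by index value, so it assumes both lists are in the same order: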

a = [{'index': 217, 'name': 'Battery'},{'index': 218, 'name': 'Fluffy'},{'index': 219, 'name': 'Dazzling'}, {'index': 220, 'name': 'Soul-Heart'}]

b = [{'index': 217, 'desc': 'Text info 2'},{'index': 218, 'desc': 'will be very informative'},{'index': 219, 'desc': 'dont know what else i could write here'},{'index': 220, 'desc': 'Boosts my wallet'}]


final_lst = []
for first, second in zip(a, b):
    first.update(second)  # modifies the dicts in a in place
    final_lst.append(first)
print(final_lst)
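If the order of the two lists cannot be guaranteed, a slightly more defensive variant (a sketch, not part of the original answer) could sort both lists by index first and check that each pair really matches. The shuffled order of b and the names get_index and final_sorted are made up for the example.

```python
from operator import itemgetter

a = [{'index': 217, 'name': 'Battery'}, {'index': 218, 'name': 'Fluffy'}]
# Deliberately in a different order than a
b = [{'index': 218, 'desc': 'will be very informative'},
     {'index': 217, 'desc': 'Text info 2'}]

get_index = itemgetter('index')
final_sorted = []
for first, second in zip(sorted(a, key=get_index), sorted(b, key=get_index)):
    # Fail loudly if the lists do not line up after sorting
    if first['index'] != second['index']:
        raise ValueError(f"index mismatch: {first['index']} != {second['index']}")
    final_sorted.append({**first, **second})  # new dict, inputs untouched

print(final_sorted)
```

zip still stops at the shorter list, so this sketch does not detect extra records at the end of the longer one.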

Answered By: Ilori Temitayo