How to merge duplicate dicts in a list in Python

Question:

I have the list below:

lst = [
       {'name': 'bezel', 'conf': 0.67},
       {'name': 'plate', 'conf': 0.69},
       {'name': 'bezel', 'conf': 0.65},
       {'name': 'plate', 'conf': 0.46},
       {'name': 'bezel', 'conf': 0.42}
]

The list above contains duplicate dicts with the names bezel and plate. There can be any number of distinct names in the list. I want to remove the duplicates and keep only the dict with the highest conf for each name. The output should look like this:

lst = [
       {'name': 'bezel', 'conf': 0.67},
       {'name': 'plate', 'conf': 0.69}
]

I can use multiple for and if loops to get the output, but is there any easier way of doing this?

Below is what I have done so far:

newLst = []  # creating new list to save data
for item in lst:
    if any(d['name'] == item['name'] for d in newLst):  # Checking if current item exist in newLst
        idx = next((index for (index, d) in enumerate(newLst) if d["name"] == item['name']), None)  # Get the index of current item name from newLst
        if newLst[idx]['conf'] < item['conf']:  # Check if its greater conf
            del newLst[idx]
            newLst.append({'name': item['name'], 'conf': item['conf']})
    else:
        newLst.append({'name': item['name'], 'conf': item['conf']})   # If not add current item

print(newLst)
Asked By: S Andrew


Answers:

You can get a compact answer by building a dictionary keyed by name. If a name is not yet in the dict, add it with its conf; otherwise, update the stored value if the new conf is higher.

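def filter_on_key(lst):
    tmp = {}  # maps each name to the highest conf seen so far
    for d in lst:
        if tmp.get(d['name']) is None:
            # First time this name appears
            tmp[d['name']] = d['conf']
        else:
            # Keep the larger of the stored and the current conf
            if d['conf'] > tmp[d['name']]:
                tmp[d['name']] = d['conf']
    return tmp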

Gets you:

out = filter_on_key(lst)
{'bezel': 0.67, 'plate': 0.69}

If you want to get back the original format, a comprehension works fine:

res = [{'name':k, 'conf':v} for k, v in out.items()]
[{'name': 'bezel', 'conf': 0.67}, {'name': 'plate', 'conf': 0.69}]
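
The two steps above can also be folded into one function. This is only a minimal sketch, assuming every dict has both a 'name' and a 'conf' key; the function name best_per_name is made up for illustration.

```python
def best_per_name(records):
    """Return one dict per name, keeping the highest 'conf' seen."""
    best = {}  # name -> highest conf seen so far
    for d in records:
        name, conf = d['name'], d['conf']
        if name not in best or conf > best[name]:
            best[name] = conf
    # Rebuild the original list-of-dicts format
    return [{'name': k, 'conf': v} for k, v in best.items()]


lst = [
    {'name': 'bezel', 'conf': 0.67},
    {'name': 'plate', 'conf': 0.69},
    {'name': 'bezel', 'conf': 0.65},
    {'name': 'plate', 'conf': 0.46},
    {'name': 'bezel', 'conf': 0.42},
]
print(best_per_name(lst))
```

Note that this only carries 'name' and 'conf' through; any other keys in the input dicts would be dropped.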
Answered By: Nathan Furnal

You can do:

result=[]
for n in {d['name'] for d in lst}:
    sl=[e for e in lst if e['name']==n]
    result.append(max(sl, key=lambda x:x['conf']))

>>> result
[{'name': 'plate', 'conf': 0.69}, {'name': 'bezel', 'conf': 0.67}]

  1. {d['name'] for d in lst} creates a set of all 'name' values contained in the list;

  2. sl=[e for e in lst if e['name']==n] filters the list for that name;

  3. max(sl, key=lambda x:x['conf']) finds the max of that filtered list keyed by the 'conf' key.

Profit!
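
The loop above rescans lst once per name. If that matters for long lists, one possible variation (not part of the approach above, just a sketch) is to group the dicts by name in a single pass with collections.defaultdict and then take the max of each group. The variable name groups is illustrative.

```python
from collections import defaultdict

lst = [
    {'name': 'bezel', 'conf': 0.67},
    {'name': 'plate', 'conf': 0.69},
    {'name': 'bezel', 'conf': 0.65},
    {'name': 'plate', 'conf': 0.46},
    {'name': 'bezel', 'conf': 0.42},
]

groups = defaultdict(list)  # name -> every dict that has that name
for d in lst:
    groups[d['name']].append(d)

# One max() per group instead of one full scan of lst per name
result = [max(group, key=lambda x: x['conf']) for group in groups.values()]
print(result)
```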


If preserving order is important, use a dict rather than a set to uniquify the list:

result=[]
for n in {d['name']:None for d in lst}:
    sl=[e for e in lst if e['name']==n]
    result.append(max(sl, key=lambda x:x['conf']))

>>> result
[{'name': 'bezel', 'conf': 0.67}, {'name': 'plate', 'conf': 0.69}]

There is also the sort / uniquify method:

>>> list({d['name']:d for d in sorted(lst, key=lambda d: (d['name'], d['conf']))}.values())
[{'name': 'bezel', 'conf': 0.67}, {'name': 'plate', 'conf': 0.69}]
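
The one-liner relies on two facts: sorting by (name, conf) puts the highest conf last within each name, and a dict comprehension keeps the last value assigned to a key. Unpacked into steps, as a sketch with illustrative variable names (ordered, last_by_name):

```python
lst = [
    {'name': 'bezel', 'conf': 0.67},
    {'name': 'plate', 'conf': 0.69},
    {'name': 'bezel', 'conf': 0.65},
    {'name': 'plate', 'conf': 0.46},
    {'name': 'bezel', 'conf': 0.42},
]

# Step 1: sort by name, then by conf ascending within each name
ordered = sorted(lst, key=lambda d: (d['name'], d['conf']))

# Step 2: later entries overwrite earlier ones, so each name
# ends up mapped to its highest-conf dict
last_by_name = {d['name']: d for d in ordered}

# Step 3: drop the keys and keep the surviving dicts
result = list(last_by_name.values())
print(result)
```

The result comes out ordered by name rather than by position in the input.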
Answered By: dawg

You can use a dict to store the input dicts by name. Then you just need to remove and replace a stored dict whenever a later one with the same name has a higher 'conf' value.

accumulator = {}
for d in lst:
    key = d['name']
    if key not in accumulator:
        accumulator[key] = d
    elif d['conf'] > accumulator[key]['conf']:
        del accumulator[key]
        accumulator[key] = d
result = list(accumulator.values())

Result:

[{'name': 'bezel', 'conf': 0.67}, {'name': 'plate', 'conf': 0.69}]

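Note that this is stable, i.e. the kept dicts appear in the output in the same relative order as in the input. That is why the key is deleted before being reassigned: assigning to an existing key would leave it at its original position in the dict.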
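
To see what the del changes, here is a small sketch on made-up data where the winning dict for a name comes after another name's entry. The function keep_highest and its move_to_end flag are hypothetical, added only to compare the two behaviours; it assumes Python 3.7+, where dicts preserve insertion order.

```python
def keep_highest(records, move_to_end=True):
    acc = {}
    for d in records:
        key = d['name']
        if key not in acc:
            acc[key] = d
        elif d['conf'] > acc[key]['conf']:
            if move_to_end:
                del acc[key]  # re-adding the key below then places it last
            acc[key] = d
    return list(acc.values())


sample = [
    {'name': 'a', 'conf': 0.1},
    {'name': 'b', 'conf': 0.5},
    {'name': 'a', 'conf': 0.9},
]

print(keep_highest(sample))
# [{'name': 'b', 'conf': 0.5}, {'name': 'a', 'conf': 0.9}]
print(keep_highest(sample, move_to_end=False))
# [{'name': 'a', 'conf': 0.9}, {'name': 'b', 'conf': 0.5}]
```

With the del, the output is a subsequence of the input; without it, names stay in order of first appearance.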

Answered By: wjandrea