Python: remove dictionaries from a list which have same value for a key so that the values of that key are unique for all the dictionaries in the list

Question:

Problem

Say I have the following list of dictionaries:

givenValues=[
{'id': '0001', 'name': 'me'},
{'id': '0002', 'name': 'me'},
{'id': '0001', 'name': 'you'},
{'id': '0003', 'name': 'hi'},
{'id': '0001', 'name': 'they'},
{'id': '0002', 'name': 'me'},
{'id': '0002', 'name': 'me'}
]

Required result

I want to keep the first of each unique id and remove all other dictionaries from the list such that the result is

[
{'id': '0001', 'name': 'me'},
{'id': '0002', 'name': 'me'},
{'id': '0003', 'name': 'hi'}
]

So far I have tried the following. Some of the attempts do work if the dictionaries in the list are arranged differently but not always:

Attempt 1

tempList=[]
for i in range(len(givenValues)):
    for j in range(i+1, len(givenValues)):
        if givenValues[i]['id']==givenValues[j]['id']:
            tempList.append(givenValues[j])

for item in tempList:
    if item in givenValues:
        givenValues.remove(item)

Result:

[
{'id': '0001', 'name': 'me'},
{'id': '0003', 'name': 'hi'}
]

Attempt 2

for i in range(len(givenValues)):
    if i<len(givenValues):
        for j in range(i+1, len(givenValues)):
            if i<len(givenValues) and givenValues[i]['id']==givenValues[j]['id']:
                givenValues.remove(givenValues[j])

Result

[
{'id': '0001', 'name': 'me'},
{'id': '0003', 'name': 'hi'},
{'id': '0001', 'name': 'they'},
{'id': '0002', 'name': 'me'}
]

Please help me solve this problem.

Asked By: HimDek

||

Answers:

Here’s one possible solution:

data = [
    {"id": "0001", "name": "me"},
    {"id": "0002", "name": "me"},
    {"id": "0001", "name": "you"},
    {"id": "0003", "name": "hi"},
    {"id": "0001", "name": "they"},
    {"id": "0002", "name": "me"},
    {"id": "0002", "name": "me"},
]

selected = {}

for item in data:
    if item["id"] not in selected:
        selected[item["id"]] = item

output = list(selected.values())

print(output)

We use a dictionary, keyed by id, to keep track of the first item seen for each id. Dictionaries preserve insertion order (guaranteed since Python 3.7), so .values() returns the items in the order in which we first found them in the original list.

The above code outputs:

[{'id': '0001', 'name': 'me'}, {'id': '0002', 'name': 'me'}, {'id': '0003', 'name': 'hi'}]
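If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name first_by_key and its key parameter are made up for illustration, and it assumes every dictionary contains the key and that the key's values are hashable.

```python
def first_by_key(items, key):
    """Keep the first dictionary seen for each distinct value of `key`."""
    selected = {}
    for item in items:
        # setdefault stores the item only if this key value is not present yet
        selected.setdefault(item[key], item)
    # dicts preserve insertion order (Python 3.7+), so first-seen order is kept
    return list(selected.values())


data = [
    {"id": "0001", "name": "me"},
    {"id": "0002", "name": "me"},
    {"id": "0001", "name": "you"},
    {"id": "0003", "name": "hi"},
]

print(first_by_key(data, "id"))
```

Unlike removing items from the list while looping over it, this builds a new list and leaves the original untouched.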

Here’s another way of approaching this, using itertools.groupby:

import itertools

data = [
    {"id": "0001", "name": "me"},
    {"id": "0002", "name": "me"},
    {"id": "0001", "name": "you"},
    {"id": "0003", "name": "hi"},
    {"id": "0001", "name": "they"},
    {"id": "0002", "name": "me"},
    {"id": "0002", "name": "me"},
]

output = []
for k, g in itertools.groupby(sorted(data, key=lambda item: item['id']), lambda item: item['id']):
    output.append(next(g))

print(output)

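This produces the same output for this data. It works by sorting the list by id, grouping the sorted items by id, and then taking the first item from each group. Because sorted() is stable, that first item is also the first occurrence of that id in the original list.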
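One difference worth keeping in mind: since the list is sorted before grouping, the groupby version returns the items ordered by id rather than by first appearance. A small sketch, using made-up sample data in which the two orders differ:

```python
import itertools
from operator import itemgetter

# Hypothetical sample data: "0002" appears before "0001"
data = [
    {"id": "0002", "name": "a"},
    {"id": "0001", "name": "b"},
    {"id": "0002", "name": "c"},
]

get_id = itemgetter("id")

# groupby approach: sort by id, then take the first item of each group
by_groupby = [
    next(group)
    for _, group in itertools.groupby(sorted(data, key=get_id), key=get_id)
]

# Dictionary approach: keeps the order of first appearance
selected = {}
for item in data:
    selected.setdefault(item["id"], item)
by_dict = list(selected.values())

print(by_groupby)  # ordered by id: "0001" first, then "0002"
print(by_dict)     # ordered by first appearance: "0002" first, then "0001"
```

Both keep the same dictionaries; only the order of the result differs, so pick whichever ordering you actually need.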

Answered By: larsks