Python: remove duplicate dictionaries from a list
Question:
I have a list of dictionaries:
l = [
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'john', 'surname': 'smith'},
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'jane', 'surname': 'bloggs'}
]
How do I remove duplicates? For example, {'firstname': 'joe', 'surname': 'bloggs'}
appears twice, and I want it to appear only once.
Answers:
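Something like this should do the job:
result = [dict(tupleized) for tupleized in set(tuple(sorted(item.items())) for item in l)]
First, I convert each dict into a sorted tuple of (key, value) pairs, then I put those tuples into a set (which removes duplicate entries), and finally I convert each tuple back into a dict. Sorting the pairs makes two dicts with the same content compare equal even if their keys were inserted in a different order. This requires every value to be hashable, and the set does not preserve the original order of the list.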
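If the original order of the list matters, the set-based idea above can be adapted to keep the first occurrence of each dictionary. The following is only a minimal sketch: the function name dedupe_dicts is hypothetical, and it assumes every value in the dictionaries is hashable.

```python
def dedupe_dicts(dicts):
    """Return a new list without duplicate dicts, keeping first occurrences."""
    seen = set()
    unique = []
    for d in dicts:
        # A frozenset of (key, value) pairs is hashable and ignores key order.
        # Assumption: every value is itself hashable.
        key = frozenset(d.items())
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique


l = [
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'john', 'surname': 'smith'},
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'jane', 'surname': 'bloggs'},
]
result = dedupe_dicts(l)
```

Unlike the one-liner, this leaves the input list untouched and returns the surviving dictionaries in their original relative order.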
# Alternative: sort the list, then keep one dictionary from each group of equal entries.
import itertools
import operator
import pprint
l = [
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'john', 'surname': 'smith'},
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'jane', 'surname': 'bloggs'}
]
getvals = operator.itemgetter('firstname', 'surname')
l.sort(key=getvals)
result = []
for k, g in itertools.groupby(l, getvals):
    # Keep only the first dictionary from each group of equal entries.
    result.append(next(g))
l[:] = result
pprint.pprint(l)
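The sort-and-group approach above can also be wrapped in a function that leaves the input list unmodified. This is just a sketch under a few assumptions: the name dedupe_by_keys is hypothetical, every dictionary must contain all of the given keys, and the values under those keys must be sortable. Two dictionaries are treated as duplicates when they agree on the given keys, and the result comes back sorted by those keys rather than in the original order.

```python
from itertools import groupby
from operator import itemgetter


def dedupe_by_keys(dicts, *keys):
    """Return one dict per distinct combination of values under `keys`."""
    getvals = itemgetter(*keys)
    # groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(dicts, key=getvals)
    # Take the first dictionary of each group of equal entries.
    return [next(group) for _, group in groupby(ordered, key=getvals)]


people = [
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'john', 'surname': 'smith'},
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'jane', 'surname': 'bloggs'},
]
unique_people = dedupe_by_keys(people, 'firstname', 'surname')
```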