python filter list of dictionaries based on key value

Question:

I have a list of dictionaries, and each dictionary has a key (let’s say) ‘type’ whose value can be 'type1', 'type2', etc. My goal is to filter these dictionaries into a new list containing only the ones of a certain “type”. I think I’m just really struggling with list/dictionary comprehensions.

An example list would look like:

exampleSet = [{'type':'type1'},{'type':'type2'},{'type':'type2'}, {'type':'type3'}]

I have a list of key values, for example:

keyValList = ['type2','type3']

The expected resulting list would look like:

expectedResult = [{'type':'type2'},{'type':'type2'},{'type':'type3'}]

I know I could do this with a set of for loops, but there has to be a simpler way. I found a lot of different flavors of this question, but none that really fit the bill. I would post my attempts, but they weren’t that impressive, so it is probably best to leave the question open-ended. Any assistance would be greatly appreciated.

Asked By: m25


Answers:

You can use a list comprehension:

>>> exampleSet = [{'type':'type1'},{'type':'type2'},{'type':'type2'}, {'type':'type3'}]
>>> keyValList = ['type2','type3']
>>> expectedResult = [d for d in exampleSet if d['type'] in keyValList]
>>> expectedResult
[{'type': 'type2'}, {'type': 'type2'}, {'type': 'type3'}]

Another way is to use filter:

>>> list(filter(lambda d: d['type'] in keyValList, exampleSet))
[{'type': 'type2'}, {'type': 'type2'}, {'type': 'type3'}]
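
A possible refinement, sketched under two assumptions that are not part of the question: the list of wanted values may be long, and some dictionaries may lack the 'type' key. Converting the wanted values to a set makes each membership test O(1) on average, and d.get('type') avoids a KeyError. The names example_set and wanted_types are illustrative, and the extra dictionary without a 'type' key is added only to show the behaviour.

```python
example_set = [
    {'type': 'type1'},
    {'type': 'type2'},
    {'type': 'type2'},
    {'type': 'type3'},
    {'name': 'no type key'},  # assumed extra entry, not in the question
]

# A set gives O(1) average-time membership tests; the values must be hashable.
wanted_types = {'type2', 'type3'}

# d.get('type') returns None when the key is missing; None is not in
# wanted_types, so such dictionaries are skipped instead of raising KeyError.
filtered = [d for d in example_set if d.get('type') in wanted_types]
print(filtered)
```

For a handful of wanted values the difference from a plain list is negligible, so this mostly matters when keyValList grows large.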
Answered By: Bhargav Rao

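Use filter. In Python 2, filter builds the whole list at once, so if the number of dictionaries in exampleSet is very large, use ifilter from the itertools module instead; it returns an iterator rather than filling up your system’s memory with the entire list. In Python 3, itertools.ifilter no longer exists and the built-in filter already returns an iterator:

# Python 3: filter is lazy, so elements are produced one at a time
for elem in filter(lambda x: x['type'] in keyValList, exampleSet):
    print(elem)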
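
For comparison, a generator expression is another way to get lazy filtering in Python 3 without a lambda. This is only a sketch using the question's data; the names matches, first and rest are illustrative.

```python
exampleSet = [{'type': 'type1'}, {'type': 'type2'}, {'type': 'type2'}, {'type': 'type3'}]
keyValList = ['type2', 'type3']

# Parentheses instead of brackets create a generator: nothing is filtered yet.
matches = (d for d in exampleSet if d['type'] in keyValList)

first = next(matches)   # evaluates only as far as the first match
rest = list(matches)    # consumes the remaining matches

print(first)
print(rest)
```

A generator can be consumed only once, so build a list if the result has to be iterated several times.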
Answered By: Saksham Varma

This type of filtering is very easy to do in Pandas, especially as there are a lot of cases where lists of dictionaries work better as Pandas dataframes to begin with.

import pandas as pd

exampleSet = [{'type':'type1'}, {'type':'type2'}, {'type':'type2'}, {'type':'type3'}]
keyValList = ['type2', 'type3']

df = pd.DataFrame(exampleSet)
df[df['type'].isin(keyValList)]

results in:

    type
1   type2
2   type2
3   type3

and to get it back in dictionary form as desired by OP:

expectedResult = df[df['type'].isin(keyValList)].to_dict('records')
# the result will be [{'type': 'type2'}, {'type': 'type2'}, {'type': 'type3'}]
Answered By: wkzhu

I tested the performance of a few answers from this post.

As I initially guessed, the list comprehension is the fastest, filter with list is second, and pandas is third by a wide margin.

defined variables:

import pandas as pd

exampleSet = [{'type': 'type' + str(number)} for number in range(0, 1_000_000)]

keyValList = ['type21', 'type950000']


1st – list comprehension

%%timeit
expectedResult = [d for d in exampleSet if d['type'] in keyValList]

60.7 ms ± 188 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

2nd – filter and list

%%timeit
expectedResult = list(filter(lambda d: d['type'] in keyValList, exampleSet))

94 ms ± 328 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

3rd – pandas

%%timeit
df = pd.DataFrame(exampleSet)
expectedResult = df[df['type'].isin(keyValList)].to_dict('records')

336 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


On a side note, using pandas just to filter a list of dicts is not a great idea: building a pandas.DataFrame costs extra time and memory, and if you are not going to use the DataFrame afterwards it is simply inefficient.
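
The %%timeit lines above are IPython magics. Outside a notebook, a comparable measurement can be sketched with the standard library's timeit module. The function names, the smaller data size and the extra set-based variant are assumptions made for this sketch, and timings will differ from machine to machine.

```python
import timeit

# Smaller data set than above so the script finishes quickly (assumed size).
example_set = [{'type': 'type' + str(n)} for n in range(100_000)]
wanted_list = ['type21', 'type95000']
wanted_set = set(wanted_list)  # extra variant: set membership instead of list


def with_list():
    return [d for d in example_set if d['type'] in wanted_list]


def with_set():
    return [d for d in example_set if d['type'] in wanted_set]


def with_filter():
    return list(filter(lambda d: d['type'] in wanted_list, example_set))


runs = 3
for fn in (with_list, with_set, with_filter):
    # timeit.timeit returns the total time in seconds for `runs` calls.
    total = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {total / runs * 1000:.1f} ms per run")
```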

Answered By: kovashikawa

A universal approach to filtering a list of dictionaries based on key-value pairs:

def get_dic_filter_func(**kwargs):
    """Build a predicate for use with filter().

    The returned function keeps only the keys of its dict argument that also
    appear in kwargs and checks that the resulting dict equals kwargs."""
    def func(dic):
        dic_to_compare = {k: v for k, v in dic.items() if k in kwargs}
        return dic_to_compare == kwargs
    return func


def filter_list_of_dicts(list_of_dicts, **kwargs):
    """Filter a list of dicts by key/value pairs.

    Only dicts that contain every key/value pair given in kwargs are returned."""
    filter_func = get_dic_filter_func(**kwargs)
    return list(filter(filter_func, list_of_dicts))

Test Case / How to use

import unittest


class FilterListOfDictsTest(unittest.TestCase):  # class name is illustrative
    def test_filter_list_of_dicts(self):
        dic1 = {'a': '1', 'b': 2}
        dic2 = {'a': 1, 'b': 3}
        dic3 = {'a': 2, 'b': 3}
        the_list = [dic1, dic2, dic3]

        # No dict has key 'x', so nothing matches
        self.assertEqual([], filter_list_of_dicts(the_list, x=1))
        # '1' (str) and 1 (int) are different values
        self.assertEqual([dic1], filter_list_of_dicts(the_list, a='1'))
        self.assertEqual([dic2], filter_list_of_dicts(the_list, a=1))
        self.assertEqual([dic2, dic3], filter_list_of_dicts(the_list, b=3))
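
The functions above match each key against a single value, while the question asks for a value out of a list. One way to adapt the idea is sketched below. filter_by_allowed_values is a hypothetical helper, not part of the answer above, and it assumes each keyword argument is a collection of allowed values.

```python
def filter_by_allowed_values(list_of_dicts, **allowed):
    """Keep dicts where, for every keyword, dic[key] is one of the allowed values.

    Hypothetical helper for illustration only.
    """
    missing = object()  # sentinel: a missing key never equals an allowed value
    return [
        dic for dic in list_of_dicts
        if all(dic.get(key, missing) in values for key, values in allowed.items())
    ]


exampleSet = [{'type': 'type1'}, {'type': 'type2'}, {'type': 'type2'}, {'type': 'type3'}]
result = filter_by_allowed_values(exampleSet, type=['type2', 'type3'])
print(result)
```

Called without keyword arguments, this sketch returns every dictionary, because all() of an empty sequence is True.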
Answered By: pymen