filter items in a python dictionary where keys contain a specific string

Question:

I’m a C coder developing something in python. I know how to do the following in C (and hence in C-like logic applied to python), but I’m wondering what the ‘Python’ way of doing it is.

I have a dictionary d, and I’d like to operate on a subset of the items, only those whose key (string) contains a specific substring.

i.e. the C logic would be:

for key in d:
    if filter_string in key:
        # do something
    else
        # do nothing, continue

I’m imagining the python version would be something like

filtered_dict = crazy_python_syntax(d, substring)
for key,value in filtered_dict.iteritems():
    # do something

I’ve found a lot of posts on here regarding filtering dictionaries, but couldn’t find one which involved exactly this.

My dictionary is not nested and i’m using python 2.7

Asked By: memo

||

Answers:

How about a dict comprehension:

filtered_dict = {k:v for k,v in d.iteritems() if filter_string in k}

One you see it, it should be self-explanatory, as it reads like English pretty well.

This syntax requires Python 2.7 or greater.

In Python 3, there is only dict.items(), not iteritems() so you would use:

filtered_dict = {k:v for (k,v) in d.items() if filter_string in k}
Answered By: Jonathon Reinhart
input = {"A":"a", "B":"b", "C":"c"}
output = {k:v for (k,v) in input.items() if key_satifies_condition(k)}
Answered By: jspurim

Jonathon gave you an approach using dict comprehensions in his answer. Here is an approach that deals with your do something part.

If you want to do something with the values of the dictionary, you don’t need a dictionary comprehension at all:

I’m using iteritems() since you tagged your question with

results = map(some_function, [(k,v) for k,v in a_dict.iteritems() if 'foo' in k])

Now the result will be in a list with some_function applied to each key/value pair of the dictionary, that has foo in its key.

If you just want to deal with the values and ignore the keys, just change the list comprehension:

results = map(some_function, [v for k,v in a_dict.iteritems() if 'foo' in k])

some_function can be any callable, so a lambda would work as well:

results = map(lambda x: x*2, [v for k,v in a_dict.iteritems() if 'foo' in k])

The inner list is actually not required, as you can pass a generator expression to map as well:

>>> map(lambda a: a[0]*a[1], ((k,v) for k,v in {2:2, 3:2}.iteritems() if k == 2))
[4]
Answered By: Burhan Khalid

Go for whatever is most readable and easily maintainable. Just because you can write it out in a single line doesn’t mean that you should. Your existing solution is close to what I would use other than I would user iteritems to skip the value lookup, and I hate nested ifs if I can avoid them:

for key, val in d.iteritems():
    if filter_string not in key:
        continue
    # do something

However if you realllly want something to let you iterate through a filtered dict then I would not do the two step process of building the filtered dict and then iterating through it, but instead use a generator, because what is more pythonic (and awesome) than a generator?

First we create our generator, and good design dictates that we make it abstract enough to be reusable:

# The implementation of my generator may look vaguely familiar, no?
def filter_dict(d, filter_string):
    for key, val in d.iteritems():
        if filter_string not in key:
            continue
        yield key, val

And then we can use the generator to solve your problem nice and cleanly with simple, understandable code:

for key, val in filter_dict(d, some_string):
    # do something

In short: generators are awesome.

Answered By: Brendan F

You can use the built-in filter function to filter dictionaries, lists, etc. based on specific conditions.

filtered_dict = dict(filter(lambda item: filter_str in item[0], d.items()))

The advantage is that you can use it for different data structures.

Answered By: Pulkit

You can use the built-in function ‘filter()’:

data = {'aaa':12, 'bbb':23, 'ccc':8, 'ddd':34}

# filter by key
print(dict(filter(lambda e:e[0]=='bbb', data.items() ) ) )

# filter by value
print(dict(filter(lambda e:e[1]>18, data.items() ) ) )

OUTPUT:

{'bbb':23}

{'bbb':23, 'ddd':34}
Answered By: Devalla Bharath