How do I check that multiple keys are in a dict in a single pass?
Question:
I want to do something like:
foo = {
    'foo': 1,
    'zip': 2,
    'zam': 3,
    'bar': 4
}
if ("foo", "bar") in foo:
    #do stuff
How do I check whether both 'foo' and 'bar' are in dict foo?
Answers:
This should work:
if all(key in foo for key in ["foo","bar"]):
    # do stuff
    pass
Hint: using square brackets inside all() to make a list comprehension:
if all([key in foo for key in ["foo","bar"]]):
is not only unnecessary but positively harmful, because the list is built in full before all() sees it, which defeats the normal short-circuiting behavior of all().
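The difference in short-circuiting can be made visible by counting membership tests. This is only a sketch: CountingDict is a hypothetical helper written for this illustration, not part of the standard library.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts `in` tests (illustration only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __contains__(self, key):
        self.lookups += 1
        return super().__contains__(key)


keys = ["missing", "foo", "bar"]

# Generator expression: all() stops at the first False result.
d_gen = CountingDict(foo=1, bar=4)
result_gen = all(key in d_gen for key in keys)

# List comprehension: every key is tested before all() even starts.
d_list = CountingDict(foo=1, bar=4)
result_list = all([key in d_list for key in keys])

print(result_gen, d_gen.lookups)    # False 1
print(result_list, d_list.lookups)  # False 3
```

Both forms return the same answer; only the amount of work differs.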
Well, you could do this:
>>> if all(k in foo for k in ("foo","bar")):
...     print("They're there!")
...
They're there!
How about using lambda?
from functools import reduce  # needed on Python 3, where dict.has_key() is also gone
if reduce(lambda x, y: x and y in foo, ["foo", "bar"], True): pass  # do stuff
if {"foo", "bar"} <= myDict.keys(): ...
If you’re still on Python 2, you can do
if {"foo", "bar"} <= myDict.viewkeys(): ...
If you’re still on a really old Python <= 2.6, you can call set on the dict, but it’ll iterate over the whole dict to build the set, and that’s slow:
if set(("foo", "bar")) <= set(myDict): ...
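As a small sketch of the Python 3 form wrapped in a reusable function (the name has_all_keys is made up for this example), relying on the fact that dict.keys() returns a set-like view rather than a copy:

```python
def has_all_keys(mapping, required):
    """Return True if every key in `required` is a key of `mapping` (Python 3)."""
    # mapping.keys() is a set-like view, so no copy of the dict's keys is built.
    return set(required) <= mapping.keys()


my_dict = {"foo": 1, "zip": 2, "zam": 3, "bar": 4}

print(has_all_keys(my_dict, ("foo", "bar")))   # True
print(has_all_keys(my_dict, ("foo", "nope")))  # False
```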
Using sets:
if set(("foo", "bar")).issubset(foo):
    #do stuff
    pass
Alternatively:
if set(("foo", "bar")) <= set(foo):
    #do stuff
    pass
In case you want to:
- also get the values for the keys
- check more than one dictionary
then:
from operator import itemgetter
foo = {'foo':1,'zip':2,'zam':3,'bar':4}
keys = ("foo","bar")
getter = itemgetter(*keys) # returns all values
try:
    values = getter(foo)
except KeyError:
    # not both keys exist
    pass
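One way to package the itemgetter approach so it can be reused across several dictionaries is sketched below. The function name values_if_all_present is an assumption made for this example; note that itemgetter returns a bare value rather than a tuple when given a single key, which the sketch normalizes.

```python
from operator import itemgetter


def values_if_all_present(mapping, keys):
    """Return a tuple of values for `keys`, or None if any key is missing."""
    if not keys:
        return ()  # itemgetter() needs at least one key
    getter = itemgetter(*keys)
    try:
        values = getter(mapping)
    except KeyError:
        return None
    # With exactly one key, itemgetter returns the value itself, not a tuple.
    return values if len(keys) > 1 else (values,)


foo = {"foo": 1, "zip": 2, "zam": 3, "bar": 4}
other = {"foo": 10}

print(values_if_all_present(foo, ("foo", "bar")))    # (1, 4)
print(values_if_all_present(other, ("foo", "bar")))  # None
print(values_if_all_present(other, ("foo",)))        # (10,)
```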
Not to suggest that this isn’t something you’ve already thought of, but I find that the simplest thing is usually the best:
if ("foo" in foo) and ("bar" in foo):
    # do stuff
    pass
>>> if 'foo' in foo and 'bar' in foo:
...     print('yes')
...
yes
Jason, () aren’t necessary in Python.
Alex Martelli’s solution set(queries) <= set(my_dict) is the shortest code but may not be the fastest. Assume Q = len(queries) and D = len(my_dict).
This takes O(Q) + O(D) to make the two sets, and then (one hopes!) only O(min(Q,D)) to do the subset test, assuming of course that Python set look-up is O(1). This is the worst case (when the answer is True).
The generator solution of hughdbrown (et al?) all(k in my_dict for k in queries) is worst-case O(Q).
Complicating factors:
(1) The loops in the set-based gadget are all done at C speed, whereas the all-based gadget is looping over bytecode.
(2) The caller of the all-based gadget may be able to use any knowledge of the probability of failure to order the query items accordingly, whereas the set-based gadget allows no such control.
As always, if speed is important, benchmarking under operational conditions is a good idea.
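To illustrate complicating factor (2), here is a minimal sketch that counts how many membership tests an explicit short-circuiting loop performs for two orderings of the same queries. The function all_present and the sample keys are invented for this example.

```python
def all_present(mapping, queries):
    """Return (found_all, number_of_membership_tests_performed)."""
    tests = 0
    for key in queries:
        tests += 1
        if key not in mapping:
            return False, tests  # stop at the first missing key
    return True, tests


my_dict = {"id": 1, "name": "x", "email": "y"}

# Same queries, different order: "phone" is the key most likely to be absent.
likely_missing_last = ["id", "name", "phone"]
likely_missing_first = ["phone", "id", "name"]

print(all_present(my_dict, likely_missing_last))   # (False, 3)
print(all_present(my_dict, likely_missing_first))  # (False, 1)
```

Putting the keys most likely to be missing first lets the loop fail early; a set comparison gives the caller no such ordering control.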
Simple benchmarking rig for 3 of the alternatives.
Put in your own values for D and Q
>>> from timeit import Timer
>>> setup='''from random import randint as R;d=dict((str(R(0,1000000)),R(0,1000000)) for i in range(D));q=dict((str(R(0,1000000)),R(0,1000000)) for i in range(Q));print("looking for %s items in %s"%(len(q),len(d)))'''
>>> Timer('set(q) <= set(d)','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632499
0.28672504425048828
#This one only works for Python3
>>> Timer('set(q) <= d.keys()','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632084
2.5987625122070312e-05
>>> Timer('all(k in d for k in q)','D=1000000;Q=100;'+setup).timeit(1)
looking for 100 items in 632219
1.1920928955078125e-05
You don’t have to wrap the left side in a set. You can just do this:
if {'foo', 'bar'} <= set(some_dict):
    pass
This may also perform better than the all(k in d...) solution, although set(some_dict) has to copy every key, so it is worth benchmarking with your own data.
>>> ok
{'five': '5', 'two': '2', 'one': '1'}
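>>> if ('two' and 'one' and 'five') in ok:
...     print("cool")
...
cool
This seems to work, but only by coincidence: ('two' and 'one' and 'five') evaluates to 'five', so only the last key is actually tested.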
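A quick check shows why the `and` chain is unreliable: `and` returns its last operand when all operands are truthy, so the membership test only ever sees one key. The key "missing" below is an invented example.

```python
ok = {"five": "5", "two": "2", "one": "1"}

# `and` between non-empty strings returns the last operand.
selected = ("missing" and "one" and "five")
print(selected)  # five

# So this tests only 'five' and wrongly reports success.
wrong = selected in ok
print(wrong)  # True

# Testing each key separately gives the correct answer.
right = all(k in ok for k in ("missing", "one", "five"))
print(right)  # False
```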
While I like Alex Martelli’s answer, it doesn’t seem Pythonic to me. That is, I thought an important part of being Pythonic is to be easily understandable. With that goal, <= isn’t easy to understand.
While it’s more characters, using issubset() as suggested by Karl Voigtland’s answer is more understandable. Since that method can use a dictionary as an argument, a short, understandable solution is:
foo = {'foo': 1, 'zip': 2, 'zam': 3, 'bar': 4}
if set(('foo', 'bar')).issubset(foo):
    #do stuff
    pass
I’d like to use {'foo', 'bar'} in place of set(('foo', 'bar')), because it’s shorter. However, it’s not that understandable and I think the braces are too easily confused as being a dictionary.
I think this is the smartest and most Pythonic.
{'key1','key2'} <= my_dict.keys()
You can use .issubset() as well
>>> {"key1", "key2"}.issubset({"key1":1, "key2":2, "key3": 3})
True
>>> {"key4", "key2"}.issubset({"key1":1, "key2":2, "key3": 3})
False
>>>
Just my take on this: of all the given options, two are easy to understand. My main criterion is very readable code, not exceptionally fast code. To keep the code understandable, I prefer these two of the given possibilities:
- var <= var2.keys()
- var.issubset(var2)
Since var <= var2.keys() also executed faster in my testing below, I prefer that one.
import timeit
timeit.timeit('var <= var2.keys()', setup='var={"managed_ip", "hostname", "fqdn"}; var2= {"zone": "test-domain1.var23.com", "hostname": "bakje", "api_client_ip": "127.0.0.1", "request_data": "", "request_method": "GET", "request_url": "hvar2p://127.0.0.1:5000/test-domain1.var23.com/bakje", "utc_datetime": "04-Apr-2019 07:01:10", "fqdn": "bakje.test-domain1.var23.com"}; var={"managed_ip", "hostname", "fqdn"}')
0.1745898080000643
timeit.timeit('var.issubset(var2)', setup='var={"managed_ip", "hostname", "fqdn"}; var2= {"zone": "test-domain1.var23.com", "hostname": "bakje", "api_client_ip": "127.0.0.1", "request_data": "", "request_method": "GET", "request_url": "hvar2p://127.0.0.1:5000/test-domain1.var23.com/bakje", "utc_datetime": "04-Apr-2019 07:01:10", "fqdn": "bakje.test-domain1.var23.com"}; var={"managed_ip", "hostname", "fqdn"};')
0.2644960229999924
In the case of determining whether only some keys match, this works:
any_keys_i_seek = ["key1", "key2", "key3"]
if set(my_dict).intersection(any_keys_i_seek):
    # code_here
    pass
Yet another option to find if only some keys match:
any_keys_i_seek = ["key1", "key2", "key3"]
if any_keys_i_seek & my_dict.keys():
    # code_here
    pass
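For the "at least one key" case, two further variants avoid building an intermediate set of the dict's keys. This is a sketch with made-up sample data: any() short-circuits on the first hit, and isdisjoint() is available on Python 3 keys views.

```python
my_dict = {"key1": 1, "other": 2}
any_keys_i_seek = ["key1", "key2", "key3"]

# Stops at the first key that is found.
found_any = any(k in my_dict for k in any_keys_i_seek)

# Keys views are set-like; isdisjoint() accepts any iterable.
found_any_view = not my_dict.keys().isdisjoint(any_keys_i_seek)

print(found_any, found_any_view)  # True True
```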
Another option for detecting whether all keys are in a dict:
dict_to_test = { ... } # dict
keys_sought = { "key_sought_1", "key_sought_2", "key_sought_3" } # set
if keys_sought & dict_to_test.keys() == keys_sought:
    # True -- dict_to_test contains all keys in keys_sought
    # code_here
    pass
check for existence of all keys in a dict:
{'key_1', 'key_2', 'key_3'} <= set(my_dict)
check for existence of one or more keys in a dict:
{'key_1', 'key_2', 'key_3'} & set(my_dict)
short and sweet
{"key1", "key2"} <= {*dict_name}
Here’s an alternative solution in case you want to get the items that didn’t match:
import logging
log = logging.getLogger(__name__)  # assuming log is a standard logging.Logger
not_existing_keys = [item for item in ["foo","bar"] if item not in foo]
if not_existing_keys:
    log.error('These items are missing: %s', not_existing_keys)
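If the required keys are already a set, a set difference yields the missing keys directly. This is a sketch under assumed names: the variable names and the extra key "baz" are illustrative only.

```python
required = {"foo", "bar", "baz"}
foo = {"foo": 1, "zip": 2, "zam": 3, "bar": 4}

# set.difference() accepts any iterable; iterating a dict yields its keys.
missing = required.difference(foo)

if missing:
    # sorted() gives a stable order for the message.
    print("These items are missing: %s" % sorted(missing))
```

Unlike the list comprehension, the set form does not preserve the order in which the keys were requested.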
my_dict = {
    'name': 'Askavy',
    'country': 'India',
    'age': 30
}
if set(('name', 'country', 'age')).issubset(my_dict.keys()):
    print("All keys are present in the dictionary")
else:
    print("Not all keys are present in the dictionary")
To me, the simple and easy way, even when a key in the middle of a nested path may be None, is pydash:
import pydash as _
_.get(d, 'key1.key2.key3.whatevermaybeNone.inthemiddle', default=None)