Checking dict keys to ensure a required key always exists, and that the dict has no other key names beyond a defined set of names

Question:

I have a dict in python that follows this general format:

{'field': ['$.name'], 'group': 'name', 'function': 'some_function'}

I want to do some pre-check of the dict to ensure that ‘field’ always exists, and that no more keys exist beyond ‘group’ and ‘function’ which are both optional.

I know I can do this by using a long and untidy if statement, but I’m thinking there must be a cleaner way?

This is what I currently have:

if 'field' in dict_name and (
    len(dict_name) == 1 or
    ('group' in dict_name and len(dict_name) == 2) or
    ('function' in dict_name and len(dict_name) == 2) or
    ('group' in dict_name and 'function' in dict_name and len(dict_name) == 3)
):
    ...

Essentially I’m first checking if ‘field’ exists as this is required. I’m then checking to see if it is the only key (which is fine) or if it is a key alongside ‘group’ and no others, or a key alongside ‘function’ and no others or a key alongside both ‘group’ and ‘function’ and no others.

Is there a tidier way of checking the keys supplied are only these 3 keys where two are optional?

Asked By: nimgwfc


Answers:

You can use a set of allowed keys and check the dict’s keys using the built-in all() function:

allowed_keys = {'field', 'group', 'function'}

if 'field' in d and all(key in allowed_keys for key in d):
    pass

This is actually equivalent to the set’s issubset() method, which the documentation describes as:

Test whether every element in the set is in other.

So this can become even more compact by casting the dict’s keys to a set and checking that they are a sub-set of the allowed keys:

allowed_keys = {'field', 'group', 'function'}

if 'field' in d and set(d).issubset(allowed_keys):
    pass
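
As a minimal sketch, the check above could be wrapped in a small helper so it can be reused and tested. The name has_valid_keys and the sample dicts are illustrative assumptions, not part of the original question.

```python
# Hypothetical helper name; the key names come from the question.
ALLOWED_KEYS = {'field', 'group', 'function'}

def has_valid_keys(d):
    """Return True if 'field' is present and every key is allowed."""
    return 'field' in d and set(d).issubset(ALLOWED_KEYS)

print(has_valid_keys({'field': ['$.name']}))                   # True
print(has_valid_keys({'field': ['$.name'], 'group': 'name'}))  # True
print(has_valid_keys({'group': 'name'}))                       # False: 'field' is missing
print(has_valid_keys({'field': ['$.name'], 'other': 1}))       # False: unexpected key
```
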
Answered By: Tomerikoo

As far as I understand, you want to check that:

  1. The set {'field'} is always contained in the set of your dict keys
  2. The set of your dict keys is always contained in the set {'field', 'group', 'function'}

So just code it!

required_fields = {'field'}
allowed_fields = required_fields | {'group', 'function'}

d = {'field': 123}  # Set any value here

if required_fields <= d.keys() <= allowed_fields:
    print("Yes!")
else:
    print("No!")

This solution scales to any sets of required and allowed fields, unless you have special conditions (for example, mutually exclusive keys).
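
If it helps to report why a dict was rejected, the same two sets can drive a slightly longer check. This is only a sketch: check_keys is a hypothetical helper, and the error messages are made up for illustration.

```python
required_fields = {'field'}
allowed_fields = required_fields | {'group', 'function'}

def check_keys(d):
    """Raise ValueError naming missing or unexpected keys (hypothetical helper)."""
    missing = required_fields.difference(d)   # required keys absent from d
    unexpected = d.keys() - allowed_fields    # keys in d that are not allowed
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if unexpected:
        raise ValueError(f"unexpected keys: {sorted(unexpected)}")

check_keys({'field': 123, 'group': 'name'})   # passes silently

try:
    check_keys({'field': 123, 'extra': 0})
except ValueError as exc:
    print(exc)   # unexpected keys: ['extra']
```
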

(thanks to @Duncan for a very elegant code reduction)

Answered By: Kolay.Ne

dict.keys returns a set-like view backed by the original data. You can take advantage of that to write a very concise test:

allowed = {'field', 'group', 'function'}

if 'field' in dict_name and dict_name.keys() <= allowed:
    ...

The set operator <= is equivalent to the issubset method.

You can use other set operations for the second condition:

allowed >= dict_name.keys()
len(dict_name.keys() | allowed) <= len(allowed)
not (dict_name.keys() - allowed)
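
A quick sanity check suggests that these spellings agree with the <= form. The helper subset_checks and the sample dicts below are assumptions made for illustration.

```python
allowed = {'field', 'group', 'function'}

def subset_checks(d):
    """Evaluate the four equivalent 'no unexpected keys' tests on d."""
    keys = d.keys()
    return [
        keys <= allowed,
        allowed >= keys,
        len(keys | allowed) <= len(allowed),
        not (keys - allowed),
    ]

print(subset_checks({'field': 1, 'group': 'g'}))  # [True, True, True, True]
print(subset_checks({'field': 1, 'extra': 0}))    # [False, False, False, False]
```
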

Aside from readability, operators are sometimes the only option with a keys view, because the view does not provide named set methods such as issubset. For example, the following raises an AttributeError:

dict_name.keys().issubset(allowed)

But the following works just fine:

dict_name.keys() <= allowed

You can do

allowed.issuperset(dict_name.keys())

But that’s likely to wrap dict_name.keys() in an unnecessary set object. At the same time,

allowed >= dict_name.keys()

will actually flip the operator and use the <= version, because of how Python’s comparison operators fall back to the reflected operation when the left operand does not support the right operand’s type.
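
The difference is easy to confirm interactively. The sketch below relies only on built-in behaviour; d is an arbitrary sample dict.

```python
allowed = {'field', 'group', 'function'}
d = {'field': ['$.name'], 'group': 'name'}

keys = d.keys()

# Keys views support the set operators but have no issubset method.
print(hasattr(keys, 'issubset'))   # False
print(keys <= allowed)             # True
print(allowed >= keys)             # True

# issuperset accepts any iterable, so a keys view works as its argument.
print(allowed.issuperset(keys))    # True
```
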

Answered By: Mad Physicist

You can also use a validation package such as schema (https://pypi.org/project/schema/):

from schema import Schema, And, Optional

my_schema = Schema({
    'field': And(str, len),               # required, non-empty string
    Optional('group'): And(str, len),     # optional key
    Optional('function'): And(str, len)   # optional key
})

data = {
    'field': 'Hello',
    'group': 'This is a group',
    'function': 'some_function'
}

my_schema.validate(data)
Answered By: Juha Untinen

Yes, by converting your dict with a dataclass:

from typing import List, Optional
from dataclasses import dataclass

@dataclass
class MyDataclass:
    field: List[str]
    group: Optional[str] = None
    function: Optional[str] = None

result = MyDataclass(["$.name"], "name", "some_function")
# or, equivalently:
result = MyDataclass(field=["$.name"], group="name", function="some_function")

# access with result.field, result.group, result.function

To answer your question directly, you can write the following, and it will raise a TypeError when a required field is missing from the input dictionary or when the dictionary contains an unexpected key:

dict_name = {'field': ['$.name'], 'group': 'name', 'function': 'some_function'}

MyDataclass(**dict_name)

Note that the above only works when your keys are strings, due to the use of the double-splat operator (**).
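
As a rough sketch of how keyword unpacking into a dataclass rejects bad key sets, the hypothetical wrapper parse below turns the TypeError into a None result. The wrapper name and the sample dicts are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class MyDataclass:
    field: List[str]
    group: Optional[str] = None
    function: Optional[str] = None

def parse(d):
    """Hypothetical wrapper: return a MyDataclass, or None if the keys are wrong."""
    try:
        return MyDataclass(**d)   # keys become keyword arguments
    except TypeError:
        return None               # missing 'field' or an unexpected key

print(parse({'field': ['$.name']}))
# MyDataclass(field=['$.name'], group=None, function=None)
print(parse({'group': 'name'}))                   # None
print(parse({'field': ['$.name'], 'other': 1}))   # None
```

Dataclasses do not check value types at runtime, so this validates the key names only.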

Once converted to a dataclass, you can safely use it assured that it has the fields. This is less prone to errors, because it prevents you from mixing up a dict checked for missing parameters and an unchecked dict in different parts of your code. See Parse, Don’t Validate for a full explanation from a theoretical standpoint.

Dataclasses are the idiomatic way to do it in Python, similar to how objects (dictionaries) are the idiomatic way to do it in JavaScript. In addition, if you’re using an IDE with mypy/pyre/PEP 484 support, you will get type hints on objects. Thanks to the bidirectionality of PEP 484, that means if you create a dict with a missing field and pass it to a function that converts it to a dataclass, the type checker may be able to catch the error.

You can convert a dataclass back to a dict using dataclasses.asdict.
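
A short sketch of the round trip, reusing the class definition from above so the snippet is self-contained. The variable name record is an assumption.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class MyDataclass:
    field: List[str]
    group: Optional[str] = None
    function: Optional[str] = None

record = MyDataclass(field=['$.name'], group='name')

# asdict returns a plain dict containing every field, including defaults.
print(asdict(record))
# {'field': ['$.name'], 'group': 'name', 'function': None}
```

Optional fields that were left unset come back with the value None, so the result is not necessarily identical to the dict you started with.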

Another option is namedtuple.
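
A minimal sketch of the namedtuple variant, assuming Python 3.7 or later for the defaults argument. The type name Rule is hypothetical.

```python
from collections import namedtuple

# 'defaults' applies to the rightmost fields, so only 'field' is required.
Rule = namedtuple('Rule', ['field', 'group', 'function'], defaults=[None, None])

rule = Rule(**{'field': ['$.name'], 'group': 'name'})
print(rule.field, rule.group, rule.function)   # ['$.name'] name None

# A missing required key or an unexpected key raises TypeError.
try:
    Rule(**{'group': 'name'})
except TypeError:
    print("rejected")   # rejected
```
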

Answered By: noɥʇʎԀʎzɐɹƆ