Validate Pydantic dynamic float enum by name with OpenAPI description

Question:

Following on from this question and this discussion I am now trying to create a Pydantic BaseModel that has a field with a float Enum that is created dynamically and is validated by name. (Down the track I will probably want to use Decimal but for now I’m dealing with float.)

The discussion provides a solution to convert all Enums to validate by name, but I’m looking for how to do this for one or more individual fields, not a universal change to all Enums.

I consider this to be a common use case. The model uses an Enum which hides implementation details from the caller. The valid field values that a caller can supply are a limited list of names. These names are associated with internal values (in this case float) that the back-end wants to operate on, without requiring the caller to know them.

The valid Enum names and values change dynamically and are loaded at run time, but for the sake of clarity the result would be an Enum something like the following. Note that the Sex enum needs to be treated normally, validated and encoded by value, whereas the Factor enum needs to be validated by name:

from enum import Enum
from pydantic import BaseModel

class Sex(str, Enum):
    MALE = "M"
    FEMALE = "F"

class Factor(Enum):
    single = 1.0
    half = 0.4
    quarter = 0.1

class Model(BaseModel):
    sex: Sex
    factor: Factor
    class Config:
        json_encoders = {Factor: lambda field: field.name}

model = Model(sex="M", factor="half")
# Error: only accepts e.g. Model(sex="M", factor=0.4)

This is what I want, but it doesn’t work because the normal Pydantic Enum behaviour requires Model(factor=0.4). My caller doesn’t know the particular float that’s currently in use for this factor; it can and should only provide "half". The code that manipulates the model internally always wants to refer to the float, so I expect it to have to use model.factor.value.

It’s fairly simple to create the Enum dynamically, but that doesn’t provide any Pydantic support for validating on name. It’s all automatically validated by value. So I think this is where most of the work is:

Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})
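
A minimal sketch of building the enum from data loaded at run time, using only the standard library. The load_factors helper is hypothetical and stands in for whatever source supplies the names and values. It illustrates that the by-name lookup already exists on the enum itself; the missing piece is only getting Pydantic to use it:

```python
from enum import Enum


def load_factors() -> dict[str, float]:
    """Hypothetical loader; in practice this might read a database or a config file."""
    return {"single": 1.0, "half": 0.4, "quarter": 0.1}


# Functional API: the class name first, then a mapping of member names to values.
Factor = Enum("Factor", load_factors())

by_value = Factor(0.4)    # lookup by value, which is what Pydantic does by default
by_name = Factor["half"]  # lookup by name, which is what the caller needs

assert by_value is by_name
print(by_name.name, by_name.value)
```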

The standard way for Pydantic to customise serialization is with the json_encoders Config attribute. I’ve included that in the sample static Enum. That doesn’t seem to be problematic.

Finally, there needs to be support to provide the right description to the OpenAPI schema.

Actually, in my use-case I only need the Enum name/values to be dynamically established. So an implementation that modifies a declared Enum would work, as well as an implementation that creates the Enum type.

Asked By: NeilG


Answers:

Update (2023-03-03)

Class decorator solution

A convenient way to solve this is by creating a reusable decorator that adds both a __get_validators__ method and a __modify_schema__ method to any given Enum class. Both of these methods are documented here.

We can define a custom validator function that will be called for our decorated Enum classes. It accepts only member names (which are turned into the corresponding members) and actual members (which pass through unchanged); anything else fails validation.

The schema modifier will ensure that the JSON schema only shows the names as enum options.

from collections.abc import Callable, Iterator
from enum import EnumMeta
from typing import Any, Optional, TypeVar, cast

from pydantic.fields import ModelField

E = TypeVar("E", bound=EnumMeta)

def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    field_schema["enum"] = list(cast(EnumMeta, field.type_).__members__.keys())

def __enum_name_validator__(v: Any, field: ModelField) -> Any:
    assert isinstance(field.type_, EnumMeta)
    if isinstance(v, field.type_):
        return v  # value is already an enum member
    try:
        return field.type_[v]  # get enum member by name
    except KeyError:
        raise ValueError(f"Invalid {field.type_.__name__} `{v}`")

def __get_enum_validators__() -> Iterator[Callable[..., Any]]:
    yield __enum_name_validator__

def validate_by_name(cls: E) -> E:
    setattr(cls, "__modify_schema__", __modify_enum_schema__)
    setattr(cls, "__get_validators__", __get_enum_validators__)
    return cls

Usage

from enum import Enum
from random import choices, random
from string import ascii_lowercase

from pydantic import BaseModel

# ... import validate_by_name


# Randomly generate an enum of floats:
_members = {
    name: round(random(), 1)
    for name in choices(ascii_lowercase, k=3)
}
Factor = Enum("Factor", _members)  # type: ignore[misc]
validate_by_name(Factor)
first_member = next(iter(Factor))
print("`Factor` members:", Factor.__members__)
print("First `Factor` member:", first_member)


class Foo(Enum):
    member_a = "a"
    member_b = "b"


@validate_by_name
class Bar(int, Enum):
    x = 1
    y = 2


class Model(BaseModel):
    factor: Factor
    foo: Foo
    bar: Bar

    class Config:
        json_encoders = {Factor: lambda field: field.name}


obj = Model.parse_obj({
    "factor": first_member.name,
    "foo": "a",
    "bar": "x",
})
print(obj.json(indent=4))
print(Model.schema_json(indent=4))

Example output:

`Factor` members: {'r': <Factor.r: 0.1>, 'j': <Factor.j: 0.9>, 'z': <Factor.z: 0.6>}
First `Factor` member: Factor.r
{
    "factor": "r",
    "foo": "a",
    "bar": 1
}
{
    "title": "Model",
    "type": "object",
    "properties": {
        "factor": {
            "$ref": "#/definitions/Factor"
        },
        "foo": {
            "$ref": "#/definitions/Foo"
        },
        "bar": {
            "$ref": "#/definitions/Bar"
        }
    },
    "required": [
        "factor",
        "foo",
        "bar"
    ],
    "definitions": {
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": [
                "r",
                "j",
                "z"
            ]
        },
        "Foo": {
            "title": "Foo",
            "description": "An enumeration.",
            "enum": [
                "a",
                "b"
            ]
        },
        "Bar": {
            "title": "Bar",
            "description": "An enumeration.",
            "enum": [
                "x",
                "y"
            ],
            "type": "integer"
        }
    }
}

This just demonstrates a few variations for this approach. As you can see, the Factor and Bar enums are both validated by name, whereas Foo is validated by value (as a regular Enum).

Since we defined a custom JSON Encoder for Factor, the factor value is exported/encoded as the name string, while both Foo and Bar are exported by value (as a regular Enum).

Both Factor and Bar display the enum names in their JSON schema, while Foo shows the enum values.

Note that the "type": "integer" in the JSON Schema of Bar is only present because I specified int as an explicit base class of Bar; it disappears if we remove that. To further ensure consistency, we could of course also simply add "type": "string" inside our __modify_enum_schema__ function.
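
A rough sketch of that last suggestion, written as a standalone function so it runs without Pydantic. The name names_only_schema is made up for this illustration, and it takes the enum class directly instead of a ModelField; the starting schema dict is a hand-written stand-in for what an int-based enum would otherwise show:

```python
from enum import Enum, EnumMeta
from typing import Any


def names_only_schema(field_schema: dict[str, Any], enum_cls: EnumMeta) -> None:
    """Overwrite the schema in place so it advertises member names as strings."""
    field_schema["enum"] = list(enum_cls.__members__.keys())
    field_schema["type"] = "string"  # names are always strings, whatever the values are


class Bar(int, Enum):
    x = 1
    y = 2


# Hand-written stand-in for the default schema of an int-based enum.
schema: dict[str, Any] = {"title": "Bar", "enum": [1, 2], "type": "integer"}
names_only_schema(schema, Bar)
print(schema)
```

Inside the decorator version, the same two assignments would go into __modify_enum_schema__, with field.type_ in place of enum_cls.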

The only thing that is seemingly impossible right now is to also somehow register our custom way of encoding those enums inside our decorator, so that we do not need to set it in the Config or pass the encoder argument to json explicitly. That may be possible with a few changes to the BaseModel logic, but I think this would be overkill.


Original answer

Validating Enum by name

The parsing part of your problem can be solved fairly easily with a custom validator.

Since a validator method can take the ModelField as an argument and that has the type_ attribute pointing to the type of the field, we can use that to try to coerce any value to a member of the corresponding Enum.

We can actually write a more or less generalized implementation that applies to any arbitrary Enum subtype fields. If we use the "*" argument for the validator, it will apply to all fields, but we also need to set pre=True to perform our checks before the default validators kick in:

from enum import Enum
from typing import Any

from pydantic import BaseModel, validator
from pydantic.fields import ModelField


class CustomBaseModel(BaseModel):
    @validator("*", pre=True)
    def coerce_to_enum_member(cls, v: Any, field: ModelField) -> Any:
        """For any `Enum` typed field, attempt to coerce `v` to a member (by value, then by name)."""
        type_ = field.type_
        if not (isinstance(type_, type) and issubclass(type_, Enum)):
            return v  # field is not an enum type
        if isinstance(v, type_):
            return v  # value is already an enum member
        try:
            return type_(v)  # get enum member by value
        except ValueError:
            try:
                return type_[v]  # get enum member by name
            except KeyError:
                raise ValueError(f"Invalid {type_.__name__} `{v}`")

That validator is agnostic of the specific Enum subtype and should work for all of them, because it only uses the common EnumType API (EnumType is called EnumMeta before Python 3.11), such as EnumType.__getitem__ to get the member by name.

The nice thing about this approach is that while valid Enum names will be turned into the correct Enum members, passing a valid Enum value still works as it did before. As does passing the member directly.
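
The lookup order described above can be tried outside of Pydantic with a plain function. This is only a sketch of the same logic; coerce_to_member is a hypothetical name, and it takes the enum class as an argument because there is no ModelField here:

```python
from enum import Enum
from typing import Any


def coerce_to_member(enum_cls: type[Enum], v: Any) -> Enum:
    """Return a member of `enum_cls` for a member, a valid value, or a valid name."""
    if isinstance(v, enum_cls):
        return v  # already a member
    try:
        return enum_cls(v)  # lookup by value
    except ValueError:
        try:
            return enum_cls[v]  # lookup by name
        except KeyError:
            raise ValueError(f"Invalid {enum_cls.__name__} `{v}`") from None


Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

# All three spellings end up at the same member.
assert coerce_to_member(Factor, "half") is Factor["half"]
assert coerce_to_member(Factor, 0.4) is Factor["half"]
assert coerce_to_member(Factor, Factor["half"]) is Factor["half"]
```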

Enum names in the JSON Schema

This is a bit more hacky, but not too bad.

Pydantic actually allows us to easily customize schema generation for specific fields. This is done by adding the __modify_schema__ classmethod to the type in question.

For Enum this turns out to be tricky, especially since you want it to be created dynamically (via the Functional API). We cannot simply subclass Enum and add our modifier method there due to some magic around the EnumType. What we can do is simply monkey-patch it into Enum (or alternatively do that to our specific Enum subclasses).

Either way, this method again gives us all we need to replace the default "enum" schema section with an array of names instead of values:

from enum import Enum
from typing import Any, Optional

from pydantic.fields import ModelField


def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    enum_cls = field.type_
    assert isinstance(enum_cls, type) and issubclass(enum_cls, Enum)
    field_schema["enum"] = list(enum_cls.__members__.keys())


# Monkey-patch `Enum` to customize schema modification:
Enum.__modify_schema__ = __modify_enum_schema__  # type: ignore[attr-defined]

And that is all we need. (Mypy will complain about the monkey-patching of course.)

Full demo

from enum import Enum
from random import choices, random
from string import ascii_lowercase
from typing import Any, Optional

from pydantic import BaseModel, validator
from pydantic.fields import ModelField


def __modify_enum_schema__(
    field_schema: dict[str, Any],
    field: Optional[ModelField],
) -> None:
    if field is None:
        return
    enum_cls = field.type_
    assert isinstance(enum_cls, type) and issubclass(enum_cls, Enum)
    field_schema["enum"] = list(enum_cls.__members__.keys())


# Monkey-patch `Enum` to customize schema modification:
Enum.__modify_schema__ = __modify_enum_schema__  # type: ignore[attr-defined]


class CustomBaseModel(BaseModel):
    @validator("*", pre=True)
    def coerce_to_enum_member(cls, v: Any, field: ModelField) -> Any:
        """For any `Enum` typed field, attempt to coerce `v` to a member (by value, then by name)."""
        type_ = field.type_
        if not (isinstance(type_, type) and issubclass(type_, Enum)):
            return v  # field is not an enum type
        if isinstance(v, type_):
            return v  # value is already an enum member
        try:
            return type_(v)  # get enum member by value
        except ValueError:
            try:
                return type_[v]  # get enum member by name
            except KeyError:
                raise ValueError(f"Invalid {type_.__name__} `{v}`")


# Randomly generate an enum of floats:
_members = {
    name: round(random(), 1)
    for name in choices(ascii_lowercase, k=3)
}
Factor = Enum("Factor", _members)  # type: ignore[misc]
first_member_name = next(iter(Factor)).name
print("Random `Factor` members:", Factor.__members__)
print("First member:", first_member_name)


class Model(CustomBaseModel):
    factor: Factor
    foo: str
    bar: int

    class Config:
        json_encoders = {Factor: lambda field: field.name}


obj = Model.parse_obj({
    "factor": first_member_name,
    "foo": "spam",
    "bar": -1,
})
print(obj.json(indent=4))
print(Model.schema_json(indent=4))

Output:

Random `Factor` members: {'a': <Factor.a: 0.9>, 'q': <Factor.q: 0.6>, 'e': <Factor.e: 0.8>}
First member: a
{
    "factor": "a",
    "foo": "spam",
    "bar": -1
}
{
    "title": "Model",
    "type": "object",
    "properties": {
        "factor": {
            "$ref": "#/definitions/Factor"
        },
        "foo": {
            "title": "Foo",
            "type": "string"
        },
        "bar": {
            "title": "Bar",
            "type": "integer"
        }
    },
    "required": [
        "factor",
        "foo",
        "bar"
    ],
    "definitions": {
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": [
                "a",
                "q",
                "e"
            ]
        }
    }
}

Notes

I chose this super weird way of randomly generating an Enum just for illustrative purposes. I wanted to show that both validation and schema generation still work fine in that case. But in practice I would assume that the names actually don’t change that drastically every time the program is run. (At least I hope they don’t for the sake of your users.)

The value of factor is still a regular Enum member, so obj.factor.value will still give us 0.9 (for this random example).

The validator will obviously prevent invalid names/values from being passed. You can make it more specific if you like, or restrict it to only deal with str arguments, assuming them to be Enum member names, and delegate the rest to Pydantic’s default validator. As it is written right now, it essentially replaces that default Enum validator.
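
The str-only variant mentioned above might look roughly like this. It is a sketch written as a plain function (names_only is a hypothetical name) so that it runs without Pydantic; in a pre-validator, returning the value unchanged is what hands it on to the default Enum validation:

```python
from enum import Enum
from typing import Any


def names_only(enum_cls: type[Enum], v: Any) -> Any:
    """Resolve strings as member names; pass everything else through untouched."""
    if not isinstance(v, str):
        return v  # members and raw values are left for the default validation step
    try:
        return enum_cls[v]
    except KeyError:
        raise ValueError(f"Invalid {enum_cls.__name__} name `{v}`") from None


Factor = Enum("Factor", {"single": 1.0, "half": 0.4, "quarter": 0.1})

assert names_only(Factor, "half") is Factor["half"]
assert names_only(Factor, 0.4) == 0.4  # untouched, to be validated by value later
```

One consequence of this design: for an enum whose values are themselves strings, a string that is a valid value but not a valid name would be rejected.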

Any other schema modifications (such as the description) can be done according to the docs I linked as well.

Answered By: Daniil Fajnberg

I’ve managed to almost complete my own answer to this question, using methods attached to the dynamic Enum to handle schema generation and validation, but there is still apparently a problem with JSON encoding.

I preferred to attach the custom processing to the type (Factor) because that is its logical home, given the modifications are all related to the type, not the model. This also keeps it DRY if the type is used in other models too. But the Pydantic model still needs to call the custom methods on the type; they don’t function on their own. So the point is a little moot, although this design still avoids code duplication.

The following code should run as-is, and accomplishes everything that is in the question, except Pydantic doesn’t seem to be respecting the json_encoders config with this set-up.

import types
from enum import Enum
from pydantic import BaseModel, ValidationError
import pytest


class Sex(str, Enum):
    """Normal Enum validated by value."""
    MALE = "M"
    FEMALE = "F"


def __modify_schema__(cls, schema):
    """Specify Enum names for schema for Factor enum."""
    schema["enum"] = list(cls.__members__.keys())
    schema["type"] = "string"


def __get_validators__(cls):
    """Validators for Factor enum."""
    yield cls._validate


def _validate(cls, value):
    """Validation for Factor enum by name, not value."""
    names = list(cls.__members__.keys())
    if value in names:
        return cls.__members__[value]
    raise ValueError(f"{value} is not a valid enumeration member for {cls.__name__}; permitted: {names}")


members = {"single": 1.0, "half": 0.4, "quarter": 0.1}
"""Change these members to create dynamic enum Factor."""
Factor = Enum("Factor", members, type=float)
Factor.__modify_schema__ = types.MethodType(__modify_schema__, Factor)
Factor._validate = types.MethodType(_validate, Factor)
Factor.__get_validators__ = types.MethodType(__get_validators__, Factor)


class Model(BaseModel):
    sex: Sex
    factor: Factor

    class Config:
        json_encoders = {Factor: lambda field: field.name}
        """Apparently the JSON encoder is not being called."""

model = Model(sex="M", factor="half")

# broken: assert model.json() == '{"sex": "M", "factor": "half"}'

assert model.schema() == {
    "title": "Model",
    "type": "object",
    "properties": {"sex": {"$ref": "#/definitions/Sex"}, "factor": {"$ref": "#/definitions/Factor"}},
    "required": ["sex", "factor"],
    "definitions": {
        "Sex": {"title": "Sex", "description": "An enumeration.", "enum": ["M", "F"], "type": "string"},
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": ["single", "half", "quarter"],
            "type": "string",
        },
    },
}


with pytest.raises(ValidationError) as excinfo:
    model = Model(sex="M", factor=1.0)
assert excinfo.value.errors()[0]["msg"].startswith("1.0 is not a valid enumeration member for Factor;")

with pytest.raises(ValidationError) as excinfo:
    model = Model(sex="MALE", factor="half")
assert excinfo.value.errors()[0]["msg"].startswith("value is not a valid enumeration member; permitted: 'M', 'F'")

I did try to subclass the dynamically created type Factor, in order to add the altered behaviour in a normally defined class, but it seems the dynamically created Enum doesn’t like that. Python says TypeError: ReFactor: cannot extend enumeration 'Factor' when attempting class ReFactor(Factor):.
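
The error comes from the enum module itself: an Enum that already has members cannot be subclassed. A possible way around it, shown here only as a standard-library sketch, is to put the extra behaviour on a memberless base class and call the functional API on that base. ByName, from_name and NamedFactor are names invented for this illustration:

```python
from enum import Enum

members = {"single": 1.0, "half": 0.4, "quarter": 0.1}
Factor = Enum("Factor", members)

try:
    class ReFactor(Factor):  # Factor already has members, so this fails
        pass
except TypeError as exc:
    print("Subclassing failed:", exc)


class ByName(Enum):
    """Memberless base class carrying the shared behaviour."""

    @classmethod
    def from_name(cls, name: str) -> "ByName":
        try:
            return cls[name]
        except KeyError:
            raise ValueError(f"{name!r} is not a valid {cls.__name__} name") from None


# The functional API can be called on the memberless base instead of on Enum.
NamedFactor = ByName("NamedFactor", members)
print(NamedFactor.from_name("half"))
```

Whether Pydantic-specific hooks can be attached the same way has not been checked here; the sketch only shows the enum mechanics.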

As Daniil-Fajnberg says, there is also probably a solution by making the generic validator and other magic methods look for the specific enum, but I feel it’s a bit uglier to "zoom out" to the generic case and then have to check for the individual enum, rather than just implement it on the specific enum itself. Although I’m wondering now if at least that method will work with the json_encoders.

It took me some time to find out how to apply these magic methods to the dynamically created Enum, but now that they’re working, the json_encoders call isn’t. I’ve stepped through the call to model.json() but I can’t see where json_encoders is consulted. That’s the only part of this solution missing. If anyone can tell me why json_encoders has stopped working I’d be grateful.
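
One plausible explanation, offered as an assumption and not as a confirmed diagnosis: the enum above is created with type=float, so its members are also float instances. The standard json module serialises floats natively and only calls its default fallback for objects it cannot handle, so a custom encoder passed that way would never see such members. The sketch below shows that behaviour with the standard library alone; MixedFactor, PlainFactor and encode_by_name are illustrative names:

```python
import json
from enum import Enum

members = {"single": 1.0, "half": 0.4, "quarter": 0.1}
MixedFactor = Enum("MixedFactor", members, type=float)  # members are also floats
PlainFactor = Enum("PlainFactor", members)              # members are plain Enum members


def encode_by_name(obj):
    """Fallback encoder; json only calls this for objects it cannot serialise itself."""
    if isinstance(obj, Enum):
        return obj.name
    raise TypeError(f"Cannot serialise {obj!r}")


mixed_json = json.dumps({"factor": MixedFactor.half}, default=encode_by_name)
plain_json = json.dumps({"factor": PlainFactor.half}, default=encode_by_name)
print(mixed_json)  # the float value; the fallback was never called
print(plain_json)  # the member name, produced by the fallback
```

If this is indeed the cause, dropping type=float from the Enum call would be one thing to try.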

Answered By: NeilG

As you can see from my other answer, I tried a slightly different approach from @DaniilFajnberg’s answer but it’s not quite complete.

I thought that my approach would be "nicer" because it focused the custom functions on the actual Enum type that was going to use them. However, as it turned out, it’s not that good. There are multiple custom dunder methods that have to be assigned manually to the dynamic Enum, and I think it’s even less compact than @Daniil’s. @Daniil’s method, by contrast, sticks to better documented features intended for the purpose (even though it’s spread out over the BaseModel as well as the dynamic Enum), and it reads better.

Also, fairly crucially, the json_encoders call is still working in @Daniil’s set up whereas with my solution, for some unknown reason, the json_encoders call appears to have stopped working somewhere along the line; so the JSON is wrong.

I therefore accepted @Daniil’s answer. But @Daniil implemented his answer for the general case, that is, all and every Enum in the app will be "reversed" (validated by name instead of value). Whereas my requirement is for only the specific individual Enum to be customised. So I’m just going to show here a version of @Daniil’s answer that is cut back to work on just the one Enum, for the benefit of others (and myself), but I will still accept the original answer from @Daniil since he did that work.

from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel, validator, ValidationError
from pydantic.fields import ModelField
import pytest


class Sex(str, Enum):
    MALE = "M"
    FEMALE = "F"


_members = {"single": 1.0, "half": 0.4, "quarter": 0.1}
"""Any dict of str, float pairs can be loaded from wherever at run time."""
Factor = Enum("Factor", _members)  # type: ignore[misc]
"""The Factor Enum is created dynamically."""


def __modify_factor_schema__(schema: dict[str, Any], field: Optional[ModelField]) -> None:
    """Schema modification is applied only to the specific Enum being customised."""
    if field is None:
        return  # no field context, so the enum type is not available here
    schema["enum"] = list(field.type_.__members__.keys())
    schema["type"] = "string"

Factor.__modify_schema__ = __modify_factor_schema__  # type: ignore[attr-defined]



class Model(BaseModel):
    sex: Sex
    factor: Factor

    @validator("factor", pre=True)
    def validate_by_name(cls, value: Any, field: ModelField) -> Any:
        """Return Enum member by name instead of member value."""
        members = field.type_.__members__
        if value in members:
            return members[value]
        members = list(members.keys())
        raise ValueError(f"value is not a valid enumeration member for {field.type_.__name__}; permitted: {members}")

    class Config:
        json_encoders = {Factor: lambda field: field.name}


model = Model(sex="M", factor="half")

assert model.json() == '{"sex": "M", "factor": "half"}'

assert model.schema() == {
    "title": "Model",
    "type": "object",
    "properties": {"sex": {"$ref": "#/definitions/Sex"}, "factor": {"$ref": "#/definitions/Factor"}},
    "required": ["sex", "factor"],
    "definitions": {
        "Sex": {"title": "Sex", "description": "An enumeration.", "enum": ["M", "F"], "type": "string"},
        "Factor": {
            "title": "Factor",
            "description": "An enumeration.",
            "enum": ["single", "half", "quarter"],
            "type": "string",
        },
    },
}

with pytest.raises(ValidationError) as excinfo:
    model = Model(sex="M", factor=1.0)
assert excinfo.value.errors()[0]["msg"].startswith("value is not a valid enumeration member for Factor")

with pytest.raises(ValidationError) as excinfo:
    model = Model(sex="MALE", factor="half")
assert excinfo.value.errors()[0]["msg"].startswith("value is not a valid enumeration member; permitted: 'M', 'F'")
Answered By: NeilG