How to validate based on a specific Enum member in a FastAPI Pydantic model

Question:

Here is my Pydantic model:

from enum import Enum
from pydantic import BaseModel


class ProfileField(str, Enum):
    mobile = "mobile"
    email = "email"
    address = "address"
    interests = "interests"  # needs a list of strings


class ProfileType(str, Enum):
    primary = "primary"
    secondary = "secondary"


class ProfileDetail(BaseModel):
    name: ProfileField
    value: str
    type: ProfileType

My API accepts this type of JSON and it's working fine.

{
    "data": [
        {
            "name": "email",
            "value": "[email protected]",
            "type": "primary"
        }
    ]
}

The requirement is that email is a string and needs a regex, mobile is an integer and also needs a regex, and address is a string that must be restricted to 50 characters.

Is it possible to add corresponding validations?

Asked By: putta

Answers:

Discriminated union and built-in types/validators

If I understand correctly, the actual JSON data you receive has the top-level data key and its value is an array of objects that you currently represent with your ProfileDetail schema.

If that is the case, you may be better served by not using an Enum at all for your name field and instead defining a discriminated union based on the value of the name field. You can write a separate model for each case (mobile, email, and address) and delegate validation to each of them for their own case.

Since all three of them share a base schema, you can define a base model for them to inherit from to reduce repetition. The type field for example can stay an Enum (Pydantic handles validation of those out of the box) and can be inherited by the three submodels.

For mobile and address it sounds like you can just use constr to define your constraints via the regex and max_length parameters respectively.

For email, you can use the built-in Pydantic type EmailStr (subtype of str). You’ll just need to install the optional dependency with pip install 'pydantic[email]'.

That way you should not even need to write any custom validators.

Here is the setup I suggest:

from enum import Enum
from typing import Annotated, Literal, Union
from pydantic import BaseModel, EmailStr, Field, constr

class ProfileType(str, Enum):
    primary = "primary"
    secondary = "secondary"

class BaseProfileFieldData(BaseModel):
    value: str
    type: ProfileType

class MobileData(BaseProfileFieldData):
    value: constr(regex=r"\d{5,}")  # your actual regex here
    name: Literal["mobile"]

class EmailData(BaseProfileFieldData):
    value: EmailStr
    name: Literal["email"]

class AddressData(BaseProfileFieldData):
    value: constr(max_length=50)
    name: Literal["address"]

ProfileField = Annotated[
    Union[MobileData, EmailData, AddressData],
    Field(discriminator="name")
]

class ProfileDetails(BaseModel):
    data: list[ProfileField]
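
One detail about the placeholder regex is worth checking before you swap in your own. As far as I know, Pydantic v1 applies a constr pattern with re.match semantics, so the pattern is anchored at the start of the string but not at the end. The following standard-library sketch shows the difference; the anchored variant and the sample strings are my own illustration, not part of your requirements:

```python
import re

# Pattern from the MobileData model: five or more digits, no end anchor.
unanchored = re.compile(r"\d{5,}")
# Hypothetical stricter variant that requires the whole string to be digits.
anchored = re.compile(r"^\d{5,}$")

samples = ["123456", "12", "12345abc"]

# Map each sample to (matches unanchored, matches anchored).
results = {
    s: (bool(unanchored.match(s)), bool(anchored.match(s)))
    for s in samples
}

for sample, (loose, strict) in results.items():
    print(f"{sample!r}: unanchored={loose}, anchored={strict}")
```

If trailing non-digit characters must be rejected, adding the anchors to the pattern you pass to constr is probably the simplest fix.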

Tests

Let’s test it with some fixtures:

test_data_mobile_valid = {
  "name": "mobile",
  "value": "123456",
  "type": "secondary",
}
test_data_mobile_invalid = {
  "name": "mobile",
  "value": "12",
  "type": "secondary",
}
test_data_email_valid = {
  "name": "email",
  "value": "[email protected]",
  "type": "primary",
}
test_data_email_invalid = {
  "name": "email",
  "value": "abcd@gmail@..",
  "type": "primary",
}
test_data_address_valid = {
  "name": "address",
  "value": "some street 42, 12345 example",
  "type": "secondary",
}
test_data_address_invalid = {
  "name": "address",
  "value": "x" * 51,
  "type": "secondary",
}
test_data_invalid_name = {
  "name": "foo",
  "value": "x",
  "type": "primary",
}
test_data_invalid_type = {
  "name": "mobile",
  "value": "123456",
  "type": "bar",
}

The first six should be self-explanatory. test_data_invalid_name should cause an error because "foo" is not a valid discriminator value for name. test_data_invalid_type should demonstrate the built-in enum validator catching the invalid type value "bar".

Let’s test the valid data first:

if __name__ == "__main__":
    from pydantic import ValidationError

    obj = ProfileDetails.parse_obj({
        "data": [
            test_data_mobile_valid,
            test_data_email_valid,
            test_data_address_valid,
        ]
    })
    print(obj.json(indent=4))
    ...

Output:

{
    "data": [
        {
            "value": "123456",
            "type": "secondary",
            "name": "mobile"
        },
        {
            "value": "[email protected]",
            "type": "primary",
            "name": "email"
        },
        {
            "value": "some street 42, 12345 example",
            "type": "secondary",
            "name": "address"
        }
    ]
}

No surprises here. Now test those that should not pass the value validation:

if __name__ == "__main__":
    ...
    try:
        ProfileDetails.parse_obj({
            "data": [
                test_data_mobile_invalid,
                test_data_email_invalid,
                test_data_address_invalid,
            ]
        })
    except ValidationError as exc:
        print(exc.json(indent=4))
    ...

Output:

[
    {
        "loc": [
            "data",
            0,
            "MobileData",
            "value"
        ],
        "msg": "string does not match regex \"\\d{5,}\"",
        "type": "value_error.str.regex",
        "ctx": {
            "pattern": "\\d{5,}"
        }
    },
    {
        "loc": [
            "data",
            1,
            "EmailData",
            "value"
        ],
        "msg": "value is not a valid email address",
        "type": "value_error.email"
    },
    {
        "loc": [
            "data",
            2,
            "AddressData",
            "value"
        ],
        "msg": "ensure this value has at most 50 characters",
        "type": "value_error.any_str.max_length",
        "ctx": {
            "limit_value": 50
        }
    }
]

Caught all the wrong values. Now just to be sure, the last two fixtures:

if __name__ == "__main__":
    ...
    try:
        ProfileDetails.parse_obj({
            "data": [
                test_data_invalid_name,
                test_data_invalid_type,
            ]
        })
    except ValidationError as exc:
        print(exc.json(indent=4))

Output:

[
    {
        "loc": [
            "data",
            0
        ],
        "msg": "No match for discriminator 'name' and value 'foo' (allowed values: 'mobile', 'email', 'address')",
        "type": "value_error.discriminated_union.invalid_discriminator",
        "ctx": {
            "discriminator_key": "name",
            "discriminator_value": "foo",
            "allowed_values": "'mobile', 'email', 'address'"
        }
    },
    {
        "loc": [
            "data",
            1,
            "MobileData",
            "type"
        ],
        "msg": "value is not a valid enumeration member; permitted: 'primary', 'secondary'",
        "type": "type_error.enum",
        "ctx": {
            "enum_values": [
                "primary",
                "secondary"
            ]
        }
    }
]

Seems like we get the desired behavior from our model.
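
Your enum also lists interests, which needs a list of strings. The same pattern should extend to that case by adding one more member to the union. Here is a trimmed-down sketch; InterestsData is a hypothetical model I made up for illustration, and the other submodels and the type field are left out to keep it short:

```python
from typing import Annotated, List, Literal, Union

from pydantic import BaseModel, Field, ValidationError


class AddressData(BaseModel):
    name: Literal["address"]
    value: str


class InterestsData(BaseModel):
    # Hypothetical model: same discriminator field, but value is a list.
    name: Literal["interests"]
    value: List[str]


ProfileField = Annotated[
    Union[AddressData, InterestsData],
    Field(discriminator="name"),
]


class ProfileDetails(BaseModel):
    data: List[ProfileField]


# Each item is routed to the submodel whose Literal matches its "name".
details = ProfileDetails(data=[
    {"name": "address", "value": "some street 42"},
    {"name": "interests", "value": ["chess", "hiking"]},
])
print(type(details.data[1]).__name__)
```

A plain string passed as value for "interests" should then be rejected with a ValidationError, because a string is not accepted where a list is expected.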


Caveat

If you really want a single model like the ProfileDetail you showed in your question, that will not be possible with discriminated unions, because a discriminated union has to be declared as the type of a field on another model. In that case you’ll actually have to write a custom validator (probably a root_validator) to ensure consistency between name and value.
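
A minimal sketch of what such a validator could look like, keeping the single-model layout from your question. The two regex patterns and the error messages are placeholders I picked for illustration, and the validator name is arbitrary; root_validator is the Pydantic v1 decorator (newer major versions still accept it but mark it as deprecated):

```python
import re
from enum import Enum

from pydantic import BaseModel, ValidationError, root_validator


class ProfileField(str, Enum):
    mobile = "mobile"
    email = "email"
    address = "address"


class ProfileType(str, Enum):
    primary = "primary"
    secondary = "secondary"


# Placeholder patterns, not production-grade validation.
MOBILE_RE = re.compile(r"^\d{5,}$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


class ProfileDetail(BaseModel):
    name: ProfileField
    value: str
    type: ProfileType

    # skip_on_failure=True: run only if the individual fields validated.
    @root_validator(skip_on_failure=True)
    def check_value_matches_name(cls, values):
        name, value = values["name"], values["value"]
        if name == ProfileField.mobile and not MOBILE_RE.match(value):
            raise ValueError("mobile must consist of at least 5 digits")
        if name == ProfileField.email and not EMAIL_RE.match(value):
            raise ValueError("value is not a valid email address")
        if name == ProfileField.address and len(value) > 50:
            raise ValueError("address must be at most 50 characters")
        return values


def is_valid(payload: dict) -> bool:
    """Return True if the payload passes ProfileDetail validation."""
    try:
        ProfileDetail(**payload)
    except ValidationError:
        return False
    return True
```

The downside compared to the discriminated union is that all the branching lives in one function, and the generated schema no longer documents the per-name constraints.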

Answered By: Daniil Fajnberg