Is it possible to automatically convert a Union type to a single type with pydantic?

Question:

Given the following data model:


from typing import List, Union
from pydantic import BaseModel

class Demo(BaseModel):
    id: Union[int, str]
    files: Union[str, List[str]]

Is there a way to tell pydantic to always convert id to str and files to List[str] automatically when I access them, instead of doing the conversion manually every time?

Asked By: link89


Answers:

I figured out how to do it after getting help from the maintainer. The key point is to remove Union from the type definition and use a pre-processing hook (a validator with pre=True) to convert the value before validation. Here is the sample code:

from pydantic import BaseModel, validator
from typing import List

class Demo(BaseModel):
    id: str
    files: List[str]

    @validator('id', pre=True)
    def id_must_be_str(cls, v):
        if isinstance(v, int):
            v = str(v)
        return v

    @validator('files', pre=True)
    def files_must_be_list_of_str(cls, v):
        if isinstance(v, str):
            v = [v]
        return v


obj = Demo.parse_obj({'id': 1, 'files': '/data/1.txt'})

print(type(obj.id))
print(type(obj.files))
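
The two conversions can also be sketched as plain functions and checked without Pydantic. The names normalize_id and normalize_files below are hypothetical and not part of Pydantic; each pre=True validator above could simply return the result of the matching function.

```python
from typing import Any


def normalize_id(value: Any) -> Any:
    """Turn an int id into its string form; leave anything else untouched."""
    if isinstance(value, int):
        return str(value)
    return value


def normalize_files(value: Any) -> Any:
    """Wrap a single path string in a list; leave anything else untouched."""
    if isinstance(value, str):
        return [value]
    return value


# Same sample values as in the model example above.
print(repr(normalize_id(1)))
print(normalize_files('/data/1.txt'))
print(normalize_files(['a.txt', 'b.txt']))
```

Values that are neither int nor str pass through unchanged, so the model's normal validation still gets to reject them.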

Answered By: link89

Pydantic has built-in validation logic for most common types, including str. It just so happens that the default string validator coerces values of type int, float or Decimal to str. (see str_validator source)

This means that even if you annotate id as str but pass an int value, the model will initialize without a validation error and the id value will be the string form of that value (e.g. str(42) gives "42").

list also has a built-in default validator, but it does not do what you want here. If it encounters a non-list value that it considers sequence-like (a tuple, a set, a generator and so on), it coerces the value to a list. (see list_validator source) A plain str is not treated as sequence-like, so passing one to a list[str] field fails validation instead of being wrapped. Naive coercion with list() would not help either, because it iterates over the characters. (e.g. list("abc") gives ["a", "b", "c"])

So for list[str] you will likely need your own custom pre=True validator to perform whatever you deem necessary with the str value to turn it into a list[str].
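
For illustration, here is how three obvious ways of turning a string into a list differ in plain Python. This shows ordinary built-in behaviour only, not Pydantic's internals, and the sample value is made up.

```python
value = "foo,bar,baz"  # made-up sample input

as_chars = list(value)       # iterates the string: one item per character
as_single = [value]          # wraps the whole string as a single item
as_parts = value.split(",")  # splits the string on a delimiter

print(as_chars[:3])
print(as_single)
print(as_parts)
```

Which of the last two is right depends on what the incoming string means: a single path, or a delimited list of paths.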

Example:

from pydantic import BaseModel, validator


class Demo(BaseModel):
    id: str
    files: list[str]

    @validator("files", pre=True)
    def str_to_list_of_str(cls, v: object) -> object:
        if isinstance(v, str):
            return v.split(",")
        return v


if __name__ == "__main__":
    obj = Demo.parse_obj({"id": 42, "files": "foo,bar,baz"})
    print(obj)
    print(type(obj.id), type(obj.files))

Output:

id='42' files=['foo', 'bar', 'baz']
<class 'str'> <class 'list'>

As you can see, you don’t even need any additional logic for the id field if your values are int, because they end up as str on the model instance.
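
The code above uses the Pydantic v1 API (validator, parse_obj). As far as I know, Pydantic v2 no longer coerces int to str by default, so a port would need the id conversion as well. A minimal sketch, assuming Pydantic v2 is installed; the comma-splitting rule is carried over from the example above and the validator names are arbitrary:

```python
from pydantic import BaseModel, field_validator


class Demo(BaseModel):
    id: str
    files: list[str]

    # mode="before" runs ahead of Pydantic's own validation,
    # which is the v2 counterpart of pre=True in v1.
    @field_validator("id", mode="before")
    @classmethod
    def int_to_str(cls, v: object) -> object:
        if isinstance(v, int):
            return str(v)
        return v

    @field_validator("files", mode="before")
    @classmethod
    def str_to_list_of_str(cls, v: object) -> object:
        if isinstance(v, str):
            return v.split(",")
        return v


obj = Demo.model_validate({"id": 42, "files": "foo,bar,baz"})
print(obj)
print(type(obj.id), type(obj.files))
```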

Answered By: Daniil Fajnberg