Is it possible to automatically convert a Union type to a single type with pydantic?
Question:
Given the following data model:
class Demo(BaseModel):
    id: Union[int, str]
    files: Union[str, List[str]]
Is there a way to tell pydantic to always convert id to str and files to List[str] automatically when I access them, instead of doing this manually every time?
Answers:
I figured out how to do it after getting help from the maintainer. The key point is to remove Union from the type definition and use a pre-validation hook to convert the value before validation. Here is the sample code:
from typing import List

from pydantic import BaseModel, validator


class Demo(BaseModel):
    id: str
    files: List[str]

    # pre=True runs this hook before pydantic's own str validation.
    @validator('id', pre=True)
    def id_must_be_str(cls, v):
        if isinstance(v, int):
            v = str(v)
        return v

    # Wrap a single path in a list so the field is always List[str].
    @validator('files', pre=True)
    def files_must_be_list_of_str(cls, v):
        if isinstance(v, str):
            v = [v]
        return v


obj = Demo.parse_obj({'id': 1, 'files': '/data/1.txt'})
print(type(obj.id))
print(type(obj.files))
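The two hooks above only convert the "other" member of each former Union and pass everything else through untouched, so values that already have the target type are left alone. A minimal plain-Python sketch of that logic, without pydantic, is shown here; the function names `id_to_str` and `files_to_list` are hypothetical and exist only for illustration.

```python
from typing import Any


def id_to_str(v: Any) -> Any:
    """Hypothetical standalone version of the id hook: int -> str, else unchanged."""
    if isinstance(v, int):
        return str(v)
    return v


def files_to_list(v: Any) -> Any:
    """Hypothetical standalone version of the files hook: str -> [str], else unchanged."""
    if isinstance(v, str):
        return [v]
    return v


# Both members of each former Union end up in the same shape.
ids = [id_to_str(1), id_to_str("1")]
files = [files_to_list("/data/1.txt"), files_to_list(["/data/1.txt"])]
print(ids)
print(files)
```

Because the functions return unrecognized values unchanged, anything that is neither form is still left for the regular validation step to accept or reject.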
Pydantic has built-in validation logic for most common types, including str. In pydantic v1 (the API used in this answer), the default string validator coerces values of type int, float or Decimal to str. (See the str_validator source.)
This means that even if you annotate id as str but pass an int value, the model will initialize without a validation error, and the id value will be the str version of that value. (E.g. str(42) gives "42".)
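The coercion described above can be sketched in plain Python. This is only a rough illustration of the described behavior, not pydantic's actual source; the function name `coerce_to_str` and the error message are assumptions made for the example.

```python
from decimal import Decimal
from typing import Any


def coerce_to_str(v: Any) -> str:
    """Illustrative only: accept str, convert int/float/Decimal, reject the rest."""
    if isinstance(v, str):
        return v
    if isinstance(v, (int, float, Decimal)):
        return str(v)
    raise TypeError(f"str type expected, got {type(v).__name__}")


print(coerce_to_str(42))
print(coerce_to_str(1.5))
print(coerce_to_str(Decimal("3.10")))
```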
list also has a built-in default validator, but in this case its behavior may not be what you want. If it encounters a non-list value that is a sequence (or a generator), it coerces that value to a list as well. (See the list_validator source.) Since the value you might pass will be a str, and a str is a sequence, the outcome would be a list of single-character strings built from the initial string. (E.g. list("abc") gives ["a", "b", "c"].)
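To make the difference concrete, here is a small plain-Python comparison of three ways to turn a string into a list. The sample values are made up for illustration; which conversion is right depends on what the string is meant to represent.

```python
path_value = "/data/1.txt"
csv_value = "foo, bar ,baz"

# list() iterates over the string, producing one element per character.
as_chars = list("abc")

# Wrapping keeps the whole string as a single list element.
wrapped = [path_value]

# Splitting treats the string as a delimited list; strip() removes stray spaces.
split_items = [part.strip() for part in csv_value.split(",")]

print(as_chars)
print(wrapped)
print(split_items)
```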
So for list[str] you will likely need your own custom pre=True validator that does whatever you deem necessary to turn the str value into a list[str].
Example:
from pydantic import BaseModel, validator


class Demo(BaseModel):
    id: str
    files: list[str]

    # Runs before the built-in list validation; splits a comma-separated string.
    @validator("files", pre=True)
    def str_to_list_of_str(cls, v: object) -> object:
        if isinstance(v, str):
            return v.split(",")
        return v


if __name__ == "__main__":
    obj = Demo.parse_obj({"id": 42, "files": "foo,bar,baz"})
    print(obj)
    print(type(obj.id), type(obj.files))
Output:
id='42' files=['foo', 'bar', 'baz']
<class 'str'> <class 'list'>
As you can see, you don’t need any additional logic for the id field if your values are int, because they end up as str on the model instance.