Python dataclass, what's a pythonic way to validate initialization arguments?

Question:

What's a pythonic way to validate the init arguments before instantiation without overriding the dataclass's built-in __init__?

I thought perhaps leveraging the __new__ dunder-method would be appropriate?

from dataclasses import dataclass

@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __new__(cls, *args, **kwargs):
        instance: cls = super(MyClass, cls).__new__(cls, *args, **kwargs)
        if instance.is_good:
            assert not instance.is_bad
        return instance

Asked By: tgk

||

Answers:

Define a __post_init__ method on the class; the generated __init__ will call it if defined:

from dataclasses import dataclass

@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        if self.is_good:
            assert not self.is_bad

This also works when dataclasses.replace is used to make a new instance, because replace builds the new object by calling __init__, which in turn calls __post_init__.
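
A minimal sketch of a variant, assuming you would rather raise ValueError than rely on assert (assert statements are skipped when Python runs with -O). The error message is made up for illustration. It also shows the check firing through dataclasses.replace:

```python
from dataclasses import dataclass, replace


@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        # Raise explicitly: assert statements are skipped under `python -O`.
        if self.is_good and self.is_bad:
            raise ValueError("is_good and is_bad cannot both be True")


ok = MyClass(is_good=True)

try:
    # replace() builds the copy through __init__, so __post_init__ runs again.
    replace(ok, is_bad=True)
except ValueError as exc:
    error_message = str(exc)
```

Note that __post_init__ only runs on construction; assigning instance.is_bad = True afterwards is not checked.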

Answered By: ShadowRanger

The author of the dataclasses module made a conscious decision not to implement validators like those found in similar third-party projects such as attrs, pydantic, or marshmallow. If your actual problem is within the scope of the one you posted, then doing the validation in __post_init__ is completely fine.

But if you have more complex validation procedures, or work with things like inheritance, you might want to use one of the more powerful libraries I mentioned instead of dataclass. Just to have something to look at, this is what your example could look like using pydantic (this is the pydantic 1.x API; in pydantic 2.x the validator decorator is deprecated in favor of field_validator):

>>> from pydantic import BaseModel, validator
>>> class MyClass(BaseModel):
...     is_good: bool = False
...     is_bad: bool = False
...
...     @validator('is_bad')
...     def check_something(cls, v, values):
...         if values['is_good'] and v:
...             raise ValueError("Can not be both good and bad now, can it?")
...         return v
...     
>>> MyClass(is_good=True, is_bad=False)  # this would be a valid instance
MyClass(is_good=True, is_bad=False)
>>> MyClass(is_good=True, is_bad=True)   # this wouldn't
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "pydantic/main.py", line 283, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for MyClass
is_bad
  Can not be both good and bad now, can it? (type=value_error)
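
For comparison, the inheritance case can still be handled with plain dataclasses, as long as each subclass chains to its parent's check. The sketch below assumes a hypothetical Child class with an extra score field; neither name comes from the question:

```python
from dataclasses import dataclass


@dataclass
class Base:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        if self.is_good and self.is_bad:
            raise ValueError("is_good and is_bad cannot both be True")


@dataclass
class Child(Base):
    score: int = 0  # hypothetical extra field, for illustration only

    def __post_init__(self):
        # The generated __init__ calls only this method, so the parent's
        # check has to be chained explicitly.
        super().__post_init__()
        if self.score < 0:
            raise ValueError("score must be non-negative")
```

Forgetting the super().__post_init__() call silently drops the parent's validation, which is one reason a dedicated validation library can be easier to maintain once class hierarchies grow.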

Answered By: Arne

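You can try ValidatedDC, which validates each field's value against its type annotation. Note that it checks types only, so an instance with both is_good and is_bad set to True still passes: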

from dataclasses import dataclass

from validated_dc import ValidatedDC


@dataclass
class MyClass(ValidatedDC):
    is_good: bool = False
    is_bad: bool = False


instance = MyClass()
assert instance.get_errors() is None
assert instance == MyClass(is_good=False, is_bad=False)

data = {'is_good': True, 'is_bad': True}
instance = MyClass(**data)
assert instance.get_errors() is None

data = {'is_good': 'bad_value', 'is_bad': True}
instance = MyClass(**data)
assert instance.get_errors()
print(instance.get_errors())
# {'is_good': [BasicValidationError(value_repr='bad_value', value_type=<class 'str'>, annotation=<class 'bool'>, exception=None)]}

# Correct the offending value and check again
instance.is_good = True
assert instance.is_valid()
assert instance.get_errors() is None

ValidatedDC: https://github.com/EvgeniyBurdin/validated_dc
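
To illustrate the general idea of checking values against annotations using only the standard library, here is a rough sketch. It is not how ValidatedDC is implemented: TypeChecked is a hypothetical base class, it raises immediately instead of collecting errors, and it assumes every annotation is a plain class (no generics such as List[int], no string annotations):

```python
from dataclasses import dataclass, fields


@dataclass
class TypeChecked:
    """Hypothetical base class; not how ValidatedDC is implemented."""

    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            # Assumes f.type is a plain class (no generics, no string annotations).
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


@dataclass
class MyClass(TypeChecked):
    is_good: bool = False
    is_bad: bool = False
```

With this sketch, MyClass(is_good='bad_value') raises a TypeError, while MyClass(is_good=True, is_bad=True) is accepted because only the types are checked.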

Answered By: Evgeniy_Burdin