Type hint for an instance of a non-specific dataclass
Question:
I have a function that accepts an instance of any dataclass.
What would be an appropriate type hint for it?
I haven’t found anything official in the Python documentation.
This is what I have been doing, but I don’t think it’s correct:
from typing import Any, NewType

DataClass = NewType('DataClass', Any)

def foo(obj: DataClass):
    ...
Another idea is to use a Protocol with these class attributes: __dataclass_fields__, __dataclass_params__.
Answers:
Despite its name, dataclasses.dataclass doesn’t expose a class interface. It just lets you declare a custom class in a convenient way that makes it obvious that it is going to be used as a data container. So, in theory, there is little opportunity to write something that only works on dataclasses, because dataclasses really are just ordinary classes.
In practice, there are a couple of reasons why you would want to declare dataclass-only functions anyway, and something like this is how you should go about it:
from dataclasses import dataclass
from typing import ClassVar, Dict, Protocol

class IsDataclass(Protocol):
    # as already noted in comments, checking for this attribute is currently
    # the most reliable way to ascertain that something is a dataclass
    __dataclass_fields__: ClassVar[Dict]

def dataclass_only(x: IsDataclass):
    ...  # do something that only makes sense with a dataclass

@dataclass
class Foo:
    pass

class Bar:
    pass

dataclass_only(Foo())  # a static type check should show that this line is fine ..
dataclass_only(Bar())  # .. and this one is not
This approach is also what you alluded to in your question. If you want to go for it, keep in mind that you’ll need a third-party tool such as mypy to do the static type checking for you, and if you are on Python 3.7 or earlier, you need to manually install typing_extensions, since Protocol only became part of the standard library in 3.8.
Also note that older versions of mypy (<=0.982) mistakenly expect __dataclass_fields__ to be an instance attribute, so with those versions the protocol member should be declared as just __dataclass_fields__: Dict [1].
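For code that still has to run on Python 3.7, the import can be guarded so that Protocol comes from typing_extensions only when needed. This is a minimal sketch, assuming typing_extensions has been installed on the older interpreter; the names field_names and Point are purely illustrative and not part of the original answer:

```python
import sys
from dataclasses import dataclass, fields
from typing import Any, ClassVar, Dict, List

if sys.version_info >= (3, 8):
    from typing import Protocol
else:  # assumption: typing_extensions was installed separately
    from typing_extensions import Protocol


class IsDataclass(Protocol):
    # same structural check as above: dataclasses carry this attribute
    __dataclass_fields__: ClassVar[Dict[str, Any]]


def field_names(x: IsDataclass) -> List[str]:
    """Return the field names of a dataclass instance (illustrative helper)."""
    return [f.name for f in fields(x)]


@dataclass
class Point:
    x: int
    y: int
```

Calling field_names(Point(1, 2)) passes a static check, while passing a plain object should be flagged by the type checker.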
When I first wrote it, this post also featured The Old Way of Doing Things, back when we had to make do without type checkers. I’m leaving it up, but it’s not recommended to handle this kind of feature with runtime-only failures any more:
from dataclasses import is_dataclass

def dataclass_only(x):
    """Do something that only makes sense with a dataclass.

    Raises:
        ValueError if something that is not a dataclass is passed.

    ... more documentation ...
    """
    if not is_dataclass(x):
        raise ValueError(f"'{x.__class__.__name__}' is not a dataclass!")
    ...
[1] Kudos to @Kound for updating and testing the ClassVar behavior.
There is a helper function called is_dataclass that can be used; it’s exported from dataclasses.
Basically what it does is this:
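# name of the attribute that @dataclass sets on the decorated class
_FIELDS = "__dataclass_fields__"

def is_dataclass(obj):
    """Returns True if obj is a dataclass or an instance of a
    dataclass."""
    cls = obj if isinstance(obj, type) else type(obj)
    return hasattr(cls, _FIELDS)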
It takes the type of the instance using type or, if the object is itself a class (an instance of type), the object itself.
It then checks whether the attribute named by _FIELDS, which equals "__dataclass_fields__", exists on that class. This is basically equivalent to the other answers here.
To "type" a dataclass I would do something like this:
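To make that behaviour concrete, here is a small sketch of what is_dataclass reports for a dataclass, an instance of it, and an ordinary object. The classes Foo and Bar and the checks dictionary are illustrative only:

```python
from dataclasses import dataclass, is_dataclass


@dataclass
class Foo:
    value: int = 0


class Bar:
    pass


checks = {
    "Foo class": is_dataclass(Foo),          # the class itself counts
    "Foo instance": is_dataclass(Foo()),     # and so does an instance
    "Bar instance": is_dataclass(Bar()),     # ordinary objects do not
    "has fields attr": hasattr(Foo, "__dataclass_fields__"),
}
```

Note that both the class and its instances pass the check, which matters if a function should only accept instances.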
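from typing import Any, Callable, Dict, Optional, Protocol

class DataclassProtocol(Protocol):
    __dataclass_fields__: Dict
    # __dataclass_params__ holds an internal parameters object, not a dict
    __dataclass_params__: Any
    # caveat: dataclasses that define no __post_init__ lack this member
    __post_init__: Optional[Callable]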
You can indeed use a Protocol, but I suggest decorating that Protocol as a runtime_checkable dataclass:
import dataclasses
from typing import Protocol, runtime_checkable

@runtime_checkable
@dataclasses.dataclass
class DataclassProtocol(Protocol):
    pass
The above results in:
- type-hinting with DataclassProtocol is possible and makes sense to type checkers (mypy 0.982, PyCharm 2022.2.3 CE)
- isinstance(obj, DataclassProtocol) is equivalent to dataclasses.is_dataclass(obj)
- because dataclasses.is_dataclass(DataclassProtocol) is true, type checkers’ special handling of dataclasses works
- DataclassProtocol does not require use of internal dataclass fields
The first is also accomplished by the previously given Protocols. The second results from decorating with runtime_checkable.
The latter two points rely on decorating with dataclass.
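For comparison, the effect of runtime_checkable can also be sketched on a protocol that names the attribute explicitly. Unlike the version above, this variant does rely on the internal __dataclass_fields__ attribute; the name HasDataclassFields is hypothetical and chosen only for this example:

```python
from dataclasses import dataclass
from typing import Any, ClassVar, Dict, Protocol, runtime_checkable


@runtime_checkable
class HasDataclassFields(Protocol):
    # assumption: relies on the internal attribute that @dataclass sets
    __dataclass_fields__: ClassVar[Dict[str, Any]]


@dataclass
class Foo:
    pass


class Bar:
    pass


# isinstance() checks for the presence of the protocol's members at runtime
foo_matches = isinstance(Foo(), HasDataclassFields)
bar_matches = isinstance(Bar(), HasDataclassFields)
```

Here the same protocol serves as a static type hint and as a runtime isinstance check, at the cost of depending on an undocumented attribute name.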
While this answers the question, personally I’d want to subclass DataclassProtocol into a DataclassInstanceProtocol, which specializes for not isinstance(obj, type). So far, though, I haven’t found a way to do that.
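One way to approximate the instance-only distinction is sketched below. At runtime it combines is_dataclass with the not isinstance(obj, type) check mentioned above. For static checking it assumes that the type checker’s bundled typeshed stubs provide a DataclassInstance protocol in the private _typeshed module; that module does not exist at runtime, so the import is guarded and the annotation is written as a string. The helper names are illustrative:

```python
from dataclasses import dataclass, is_dataclass
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # assumption: available in the checker's typeshed stubs; never at runtime
    from _typeshed import DataclassInstance


def is_dataclass_instance(obj: Any) -> bool:
    """True for dataclass instances, False for dataclass classes."""
    return is_dataclass(obj) and not isinstance(obj, type)


def dataclass_instance_only(x: "DataclassInstance") -> str:
    """Illustrative function restricted to dataclass instances."""
    if not is_dataclass_instance(x):
        raise TypeError(f"{x!r} is not a dataclass instance")
    return type(x).__name__


@dataclass
class Foo:
    pass
```

With this, dataclass_instance_only(Foo()) returns the class name, while passing the class Foo itself raises TypeError at runtime.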