Python dataclass `__init_subclass__` does not see the fields defined in the subclass

Question:

Please consider the code below:

import dataclasses


@dataclasses.dataclass
class A:
    a: int = 10

    def __init_subclass__(cls, **kwargs):
        for f in dataclasses.fields(cls):
            print("     ", cls, f.name)


print("Defining Subclass B")


@dataclasses.dataclass
class B(A):
    b: int = 100


print("Defining Subclass C")


@dataclasses.dataclass
class C(B):
    c: int = 1000

The output is

Defining Subclass B
      <class '__main__.B'> a
Defining Subclass C
      <class '__main__.C'> a
      <class '__main__.C'> b

I was expecting

Defining Subclass B
      <class '__main__.B'> a
      <class '__main__.B'> b
Defining Subclass C
      <class '__main__.C'> a
      <class '__main__.C'> b
      <class '__main__.C'> c

Clearly, when __init_subclass__ runs it does not yet know about the fields defined in the subclass, because they have not been processed yet. How can I get the expected output?

Asked By: Praveen Kulkarni

||

Answers:

So, a decorator works the following way:

@some_decorator
class A:
    pass

Is equivalent to:

class A:
   pass
A = some_decorator(A)

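So when __init_subclass__ runs for B, the @dataclass decorator has not yet been applied to B: __init_subclass__ is called while the class object is being created, and the decorator is only called afterwards, on the finished class. At that point B has only the __dataclass_fields__ attribute it inherits from A, which is why dataclasses.fields(cls) reports only the parent's fields.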
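
To make that ordering visible, here is a minimal sketch. The names `logging_dataclass`, `events`, `Base` and `Child` are made up for illustration and are not part of the question's code; the wrapper just records when it is called and then delegates to `dataclasses.dataclass`.

```python
import dataclasses

events = []  # records the order in which things happen


def logging_dataclass(cls):
    # Hypothetical wrapper: log the call, then apply the real decorator.
    events.append(f"decorator runs on {cls.__name__}")
    return dataclasses.dataclass(cls)


@logging_dataclass
class Base:
    a: int = 10

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only the inherited __dataclass_fields__ is visible at this point.
        names = [f.name for f in dataclasses.fields(cls)]
        events.append(f"__init_subclass__ runs on {cls.__name__}, fields={names}")


@logging_dataclass
class Child(Base):
    b: int = 100


names = [f.name for f in dataclasses.fields(Child)]
events.append(f"after decoration, fields={names}")

for line in events:
    print(line)
```

The hook for `Child` is logged before the decorator call for `Child`, and the field `b` only shows up once the decorator has run.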

You could inspect __annotations__ yourself (which is what dataclass relies on to create field objects anyway). Note that you have to walk every class in the MRO, in reverse order so that base-class fields come first, and guard against classes that lack annotations, so something like:

import dataclasses


@dataclasses.dataclass
class A:
    a: int = 10

    def __init_subclass__(cls, **kwargs):
        for klass in reversed(cls.mro()):
            # in case a base class lacks annotations, e.g. object
            annotations = getattr(klass, '__annotations__', {})
            for name, value in annotations.items():
                print("     ", cls, name, value)

print("Defining Subclass B")


@dataclasses.dataclass
class B(A):
    b: int = 100


print("Defining Subclass C")


@dataclasses.dataclass
class C(B):
    c: int = 1000

This approach could work for you. It prints the following output for me:

Defining Subclass B
      <class '__main__.B'> a <class 'int'>
      <class '__main__.B'> b <class 'int'>
Defining Subclass C
      <class '__main__.C'> a <class 'int'>
      <class '__main__.C'> b <class 'int'>
      <class '__main__.C'> c <class 'int'>
Answered By: juanpa.arrivillaga

One easy approach is to call the dataclass decorator inside __init_subclass__() (and remove the @dataclass decorator on the subclasses themselves). This way you force the dataclass initialization before your own logic executes.

import dataclasses


@dataclasses.dataclass
class A:
    a: int = 10

    def __init_subclass__(cls, **kwargs):
        # Note: unlike some decorators, this works
        # because it modifies the class in place.
        dataclasses.dataclass(cls)
        for f in dataclasses.fields(cls):
            print("     ", cls, f.name)


print("Defining Subclass B")


# No "@dataclasses.dataclass" here.
class B(A):
    b: int = 100


print("Defining Subclass C")


# No "@dataclasses.dataclass" here.
class C(B):
    c: int = 1000
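
As a quick sanity check of this approach, the sketch below repeats the same class layout without the printing and confirms that the undecorated subclasses still end up as full dataclasses. The variable names `field_names` and `instance` are only for illustration.

```python
import dataclasses


@dataclasses.dataclass
class A:
    a: int = 10

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Apply the decorator from inside the hook; with default options
        # it modifies cls in place, so the return value can be ignored.
        dataclasses.dataclass(cls)


class B(A):  # no decorator here
    b: int = 100


class C(B):  # no decorator here
    c: int = 1000


field_names = [f.name for f in dataclasses.fields(C)]
instance = C(a=1, b=2, c=3)  # generated __init__ accepts all three fields

print(dataclasses.is_dataclass(B), field_names, instance)
```

One limitation to keep in mind: options that make dataclass return a new class instead of modifying the existing one, such as slots=True, would not take effect when applied this way, because the return value is discarded.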
Answered By: mikenerone