Is it possible to programmatically generate a pyi file from an instantiated class?

Question:

I’m creating a class from a dictionary like this:

class MyClass:
    def __init__(self, dictionary):
        for k, v in dictionary.items():
            setattr(self, k, v)

I’m trying to figure out how I can get IntelliSense for this dynamically generated class. Most IDEs can read .pyi files for this sort of thing.

I don’t want to write out a pyi file manually though.

Is it possible to instantiate this class and programmatically write a pyi file to disk from it?

mypy has the stubgen tool, but I can’t figure out if it’s possible to use it this way.

Can I import stubgen from mypy and feed it MyClass(<some dict>) somehow?

Asked By: red888


Answers:

Static analysis tools like stubgen are the wrong tool for a class whose attributes are populated dynamically. They only read source code, and the attribute names and types of your fully-formed class never appear there. You have to generate the stub at runtime instead, after the code has run and the instance attributes have been populated.
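
As a rough illustration of the runtime idea, the sketch below reads the attributes off an already-populated instance with vars() and renders them as annotated names. The helper name render_stub_from_instance is hypothetical, and the sketch assumes every attribute type can be written as its module name plus __qualname__; it ignores methods, nested types, generics and None values.

```python
import statistics


def render_stub_from_instance(instance: object) -> str:
    """Render a minimal class stub from an instance's current attributes (hypothetical helper)."""
    modules: set[str] = set()
    attribute_lines: list[str] = []
    for name, value in vars(instance).items():
        value_type = type(value)
        if value_type.__module__ == "builtins":
            annotation = value_type.__qualname__
        else:
            # Qualify non-builtin types with their module and remember the import
            modules.add(value_type.__module__)
            annotation = f"{value_type.__module__}.{value_type.__qualname__}"
        attribute_lines.append(f"    {name}: {annotation}")

    # A set plus sorted() gives each module exactly one import line
    lines = [f"import {module}" for module in sorted(modules)]
    if lines:
        lines.append("")
    lines.append(f"class {type(instance).__name__}:")
    lines.extend(attribute_lines or ["    ..."])
    return "\n".join(lines) + "\n"


class MyClass:
    def __init__(self, dictionary: dict[str, object]) -> None:
        for k, v in dictionary.items():
            setattr(self, k, v)


instance = MyClass({"a": 1, "b": "my_string", "distribution": statistics.NormalDist(0.0, 1.0)})
stub_source = render_stub_from_instance(instance)
print(stub_source)
```

This only covers the attribute annotations. The rest of this answer also keeps the method signatures from the class’s source, which needs the source code and not just the instance.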


Let’s say that you have a dynamically-populated class, as in your example,

class MyClass:
    def __init__(self, dictionary: dict[str, object]) -> None:
        k: str
        v: object
        for k, v in dictionary.items():
            setattr(self, k, v)

and you pass in this dictionary to the constructor,

import statistics

instance: MyClass = MyClass({"a": 1, "b": "my_string", "distribution": statistics.NormalDist(0.0, 1.0)})

and you want this as your output:

import statistics

class MyClass:
    a: int
    b: str
    distribution: statistics.NormalDist

    def __init__(self, dictionary: dict[str, object]) -> None:
        ...

The easiest way to generate the output above is to hook into instance creation and initialisation from the outside, so that you don’t interfere with any __new__ or __init__ methods (and their chained super() calls) that already exist on your class. This can be done via a metaclass’s __call__ method:

class _PostInitialisationMeta(type):

    """
    Metaclass for classes subject to dynamic stub generation
    """

    def __call__(
        cls, dictionary: dict[str, object], *args: object, **kwargs: object
    ) -> object:

        """
        Override instance creation and initialisation. Generate a string representing
        the class's stub definition suitable for a `.pyi` file.

        Parameters
        ----------
        dictionary
            Mapping from instance attribute names to attribute values
        *args
        **kwargs
            Other positional and keyword arguments to the class's `__new__` and
            `__init__` methods

        Returns
        -------
        object
            Created instance
        """

        instance: object = super().__call__(dictionary, *args, **kwargs)
        # <generate the stub string here>
        return instance

To generate the string, you can parse the class’s source code into an abstract syntax tree, modify the tree by adding, removing, or transforming nodes, then unparse the transformed tree back into source. Here’s one possible implementation using the Python standard library’s ast.NodeVisitor:

Python 3.9+ only (it relies on ast.unparse)

from __future__ import annotations

import ast
import inspect
import typing as t


if t.TYPE_CHECKING:

    class _SupportsBodyStatements(t.Protocol):
        body: list[ast.stmt]


_CLASS_TO_STUB_SOURCE_DICT: t.Final[dict[type, str]] = {}


class _PostInitialisationMeta(type):

    """
    Metaclass for classes subject to dynamic stub generation
    """

    def __call__(
        cls, dictionary: dict[str, object], *args: object, **kwargs: object
    ) -> object:

        """
        Override instance creation and initialisation. The first time an instance of a
        class is created and initialised, cache a string representing the class's stub
        definition suitable for a `.pyi` file.

        Parameters
        ----------
        dictionary
            Mapping from instance attribute names to attribute values
        *args
        **kwargs
            Other positional and keyword arguments to the class's `__new__` and
            `__init__` methods

        Returns
        -------
        object
            Created instance
        """

        instance: object = super().__call__(dictionary, *args, **kwargs)
        _DynamicClassStubsGenerator.cache_stub_for_dynamic_class(cls, dictionary)
        return instance


def _remove_docstring(node: _SupportsBodyStatements, /) -> None:

    """
    Removes a docstring node if it exists in the given node's body
    """

    first_node: ast.stmt = node.body[0]
    if (
        isinstance(first_node, ast.Expr)
        and isinstance(first_node.value, ast.Constant)
        and (type(first_node.value.value) is str)
    ):
        node.body.pop(0)


def _replace_body_with_ellipsis(node: _SupportsBodyStatements, /) -> None:

    """
    Replaces the body of a given node with a single `...`
    """

    node.body[:] = [ast.Expr(ast.Constant(value=...))]


class _DynamicClassStubsGenerator(ast.NodeVisitor):

    """
    Generate and cache stubs for class instances whose instance variables are populated
    dynamically
    """

    @classmethod
    def cache_stub_for_dynamic_class(
        StubsGenerator, Class: type, dictionary: dict[str, object], /
    ) -> None:

        # Disallow stubs generation if the stub source is already generated
        try:
            _CLASS_TO_STUB_SOURCE_DICT[Class]
        except KeyError:
            pass
        else:
            return

        # Get class's source code
        src: str = inspect.getsource(Class)
        module_tree: ast.Module = ast.parse(src)

        class_statement: ast.stmt = module_tree.body[0]
        assert isinstance(class_statement, ast.ClassDef)

        # Strip unnecessary details from class body
        stubs_generator: _DynamicClassStubsGenerator = StubsGenerator()
        stubs_generator.visit(module_tree)

        # Adds the following:
        #  - annotated instance attributes on the class body
        #  - import statements for non-builtins
        # --------------------------------------------------
        added_import_nodes: list[ast.stmt] = []
        added_class_nodes: list[ast.stmt] = []
        k: str
        v: object
        for k, v in dictionary.items():
            value_type: type = type(v)
            value_type_name: str = value_type.__qualname__
            value_type_module_name: str = value_type.__module__

            annotated_assignment_statement: ast.stmt = ast.parse(
                f"{k}: {value_type_name}"
            ).body[0]
            assert isinstance(annotated_assignment_statement, ast.AnnAssign)
            added_class_nodes.append(annotated_assignment_statement)
            if value_type_module_name != "builtins":
                annotation_expression: ast.expr = (
                    annotated_assignment_statement.annotation
                )
                assert isinstance(annotation_expression, ast.Name)
                annotation_expression.id = (
                    f"{value_type_module_name}.{annotation_expression.id}"
                )
                added_import_nodes.append(
                    ast.Import(names=[ast.alias(name=value_type_module_name)])
                )

        module_tree.body[:] = [*added_import_nodes, *module_tree.body]
        class_statement.body[:] = [*added_class_nodes, *class_statement.body]
        _CLASS_TO_STUB_SOURCE_DICT[Class] = ast.unparse(module_tree)

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        _remove_docstring(node)
        node.keywords = []  # Clear metaclass and other keywords in class definition
        self.generic_visit(node)

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        _replace_body_with_ellipsis(node)

    def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None:
        _replace_body_with_ellipsis(node)

You can then run your class as usual, and then inspect what’s stored in the cache _CLASS_TO_STUB_SOURCE_DICT:

class MyClass(metaclass=_PostInitialisationMeta):
    def __init__(self, dictionary: dict[str, object]) -> None:
        k: str
        v: object
        for k, v in dictionary.items():
            setattr(self, k, v)

>>> import statistics
>>> instance = MyClass({"a": 1, "b": "my_string", "distribution": statistics.NormalDist(0.0, 1.0)})
>>> src: str
>>> for src in _CLASS_TO_STUB_SOURCE_DICT.values():
...     print(src)
...
import statistics

class MyClass:
    a: int
    b: str
    distribution: statistics.NormalDist

    def __init__(self, dictionary: dict[str, object]) -> None:
        ...

In practice, a .pyi file describes the type interface of a whole module, so the implementation above, which only handles a single class, isn’t usable as it stands. Before writing the source to a .pyi file, you also have to process the other kinds of nodes in the module and decide what to do with unannotated nodes, repeated imports, and so on. This is where stubgen may come in handy: it can analyse the static parts of your module, and you can then write an ast.NodeTransformer that replaces the relevant classes in that output with the ones you’ve generated dynamically.
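
As a hedged sketch of that last step, the snippet below merges several generated class stubs into one module-level source, drops repeated imports, and writes the result to disk. The function name write_stub_file, the sample stub_sources strings and the output file name are assumptions for illustration. A real module stub would also need the module’s functions, constants and so on.

```python
import ast
import tempfile
from pathlib import Path


def write_stub_file(stub_sources: list[str], path: Path) -> None:
    """Merge per-class stub sources into one `.pyi` file, de-duplicating imports."""
    seen_imports: set[str] = set()
    import_nodes: list[ast.stmt] = []
    other_nodes: list[ast.stmt] = []
    for source in stub_sources:
        for node in ast.parse(source).body:
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                # Use the unparsed statement text as the de-duplication key
                key = ast.unparse(node)
                if key not in seen_imports:
                    seen_imports.add(key)
                    import_nodes.append(node)
            else:
                other_nodes.append(node)
    # Imports first, then every class definition in the order it was given
    module = ast.Module(body=[*import_nodes, *other_nodes], type_ignores=[])
    path.write_text(ast.unparse(module) + "\n", encoding="utf-8")


# Sample inputs standing in for the values of _CLASS_TO_STUB_SOURCE_DICT
stub_sources = [
    "import statistics\n\nclass A:\n    x: statistics.NormalDist\n",
    "import statistics\n\nclass B:\n    y: int\n",
]
with tempfile.TemporaryDirectory() as directory:
    stub_path = Path(directory) / "my_module.pyi"
    write_stub_file(stub_sources, stub_path)
    written = stub_path.read_text(encoding="utf-8")
print(written)
```

With the cache from this answer, you would pass list(_CLASS_TO_STUB_SOURCE_DICT.values()) as stub_sources and a path next to the module being stubbed.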

Answered By: dROOOze