type hinting within a class

Question:

class Node:
    def append_child(self, node: Node):
        if node is not None:
            self.first_child = node
            self.child_nodes += [node]

How do I do node: Node? Because when I run it, it says name 'Node' is not defined.

Should I just remove the : Node and instance check it inside the function?
But then how could I access node’s properties (which I would expect to be an instance of the Node class)?

I don’t know how to implement type casting in Python, BTW.

Asked By: Ben Sat

Answers:

"self" references in type checking are typically done using strings:

class Node:
    def append_child(self, node: 'Node'):
        if node is not None:
            self.first_child = node
            self.child_nodes += [node]

This is described in the "Forward references" section of PEP 484.

Please note that this doesn’t do any type-checking or casting. This is a type hint, which Python (normally) disregards completely [1]. However, third-party tools (e.g. mypy) use type hints to do static analysis on your code and can report errors before runtime.

Also, starting with Python 3.7, you can implicitly convert all of the type hints in a module to strings by putting from __future__ import annotations at the top of the module. (This was originally expected to become the default in Python 4.0; that plan was later dropped in favour of lazily evaluated annotations, PEP 649, in Python 3.14.)

[1] The hints are introspectable, so you could use them to build some kind of runtime checker using decorators or the like if you really wanted to, but Python doesn’t do this by default.
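
As a rough illustration of the footnote, here is a minimal sketch of such a runtime checker. The decorator name check_types is hypothetical (it is not part of any library), and it only checks hints that are plain classes. It relies on typing.get_type_hints to turn the string 'Node' back into the class, and it does so at call time, when Node already exists.

```python
import functools
import inspect
import typing


def check_types(func):
    """Hypothetical decorator: validate arguments against the function's type hints."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Resolve the hints at call time, so the string 'Node' can be looked up
        # after the class has been bound to its name.
        hints = typing.get_type_hints(func)
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generics such as list[int] are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


class Node:
    def __init__(self):
        self.first_child = None
        self.child_nodes = []

    @check_types
    def append_child(self, node: 'Node'):
        if self.first_child is None:
            self.first_child = node
        self.child_nodes.append(node)
```

With this in place, Node().append_child(Node()) succeeds, while Node().append_child("oops") raises TypeError. A static checker such as mypy would flag the second call without running the code at all.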

Answered By: mgilson

If you just want an answer to the question, go read mgilson’s answer.

mgilson’s answer provides a good explanation of how you should work around this limitation of Python. But I think it’s also important to have a good understanding of why this doesn’t work, so I’m going to provide that explanation.

Python is a little different from other languages. In Python, there’s really no such thing as a “declaration.” As far as Python is concerned, code is just code. When you import a module, Python creates a new namespace (a place where global variables can live), and then executes each statement of the module from top to bottom. def foo(args): code is just a compound statement that bundles a bunch of source code together into a function and binds that function to the name foo.

Similarly, class Bar(bases): code creates a class, executes all of the code in its body immediately (inside a separate namespace which holds any class-level variables that might be created by the code, particularly including methods created with def), and then binds that class to the name Bar. It has to execute the code immediately, because all of the methods need to be created immediately. Because the body gets executed before the name has been bound, you can’t refer to the class at the top level of the class body. That includes the annotations on a def line: before Python 3.14 they are evaluated when the def statement runs, which is while the class body is still executing. It’s perfectly fine to refer to the class inside the body of a method, however, because that code doesn’t run until the method gets called.

(You might be wondering why we can’t just bind the name first and then execute the code. It turns out that, because of the way Python implements classes, you have to know which methods exist up front, before you can even create the class object. It would be possible to create an empty class and then bind all of the methods to it one at a time with attribute assignment (and indeed, you can manually do this, by writing class Bar: pass and then doing def method1():...; Bar.method1 = method1 and so on), but this would result in a more complicated implementation, and be a little harder to conceptualize, so Python does not do this.)
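
The manual approach mentioned in the parenthetical can be sketched like this. The parameter name other and the method body are made up for illustration. Because Bar is already bound when the def statement runs, the unquoted annotation causes no NameError.

```python
class Bar:
    """Created empty; its methods are attached afterwards."""


# Bar already exists here, so it can appear directly in the annotations.
def method1(self, other: Bar) -> Bar:
    return other


# Attribute assignment turns the plain function into a method of Bar.
Bar.method1 = method1
```

This works, but every method has to live outside the class body, which is part of why it is harder to read than an ordinary class statement.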

To summarize in code:

class C:
    C  # NameError: C doesn't exist yet.
    def method(self):
        return C  # This is fine.  By the time the method gets called, C will exist.
C  # This is fine; the class has been created by the time we hit this line.

Answered By: Kevin

Python 3.7 onwards

PEP 563 introduced postponed evaluations, stored in __annotations__ as strings. A user can enable this through the __future__ directive:

from __future__ import annotations

This makes it possible to write:

class C:
    a: C
    def foo(self, b: C):
        ...
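
To see what "stored as strings" means in practice, here is a small sketch based on the class above (the return annotation and method body are added for illustration). It assumes the future import is the first statement in the file.

```python
from __future__ import annotations

import typing


class C:
    a: C

    def foo(self, b: C) -> C:
        return b


# With the future import, annotations are kept as source strings
# instead of being evaluated when the class body runs.
raw_hints = C.foo.__annotations__

# typing.get_type_hints() evaluates those strings on request,
# by which time the name C is bound.
resolved_hints = typing.get_type_hints(C.foo)
```

Here raw_hints maps each name to the string 'C', while resolved_hints maps each name to the class C itself.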

Making this the default was planned for Python 3.10 (released 2021-10-04), but the change was postponed before that release. Python 3.14 instead adopted lazily evaluated annotations (PEP 649), which also let the code above run without the import.

Edit 2020-11-15: Originally this was announced to become mandatory in Python 4.0, but at the time of this edit it appeared that it would already be the default in Python 3.10. This surprised me, as it appeared to violate the promise in __future__ that backward compatibility would not be broken until Python 4.0. Maybe the developers considered 3.10 to be 4.0, or maybe they changed their minds. See also Why did __future__ MandatoryRelease for annotations change between 3.7 and 3.8?.

Answered By: gerrit

In Python 3.7 and later you can use dataclasses, whose fields are declared with type annotations.

In this particular example Node references itself, so if you run it as is you will get:

NameError: name 'Node' is not defined

To overcome this error you have to include:

from __future__ import annotations

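It must appear at the top of the module, before any other code (only the module docstring, comments, and other __future__ imports may precede it). It was once expected to become unnecessary in Python 4.0; that plan changed, and Python 3.14 instead evaluates annotations lazily (PEP 649).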

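from __future__ import annotations
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: int
    left: Optional[Node]   # None when there is no left child
    right: Optional[Node]  # None when there is no right child

    @property
    def is_leaf(self) -> bool:
        """Check if node is a leaf"""
        return self.left is None and self.right is None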

Example:

node5 = Node(5, None, None)
node25 = Node(25, None, None)
node40 = Node(40, None, None)
node10 = Node(10, None, None)

# balanced tree
node30 = Node(30, node25, node40)
root = Node(20, node10, node30)

# unbalanced tree
node30 = Node(30, node5, node40)
root = Node(20, node10, node30)
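
If the future import is not an option, a variation that should behave the same way is to quote the class name inside the field annotations. This is only a sketch: the None defaults are an addition for convenience and are not part of the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    # Quoting 'Node' defers the lookup; Optional says None is allowed.
    left: Optional['Node'] = None
    right: Optional['Node'] = None

    @property
    def is_leaf(self) -> bool:
        """Return True when the node has no children."""
        return self.left is None and self.right is None


leaf = Node(25)
parent = Node(30, leaf, Node(40))
```

Here leaf.is_leaf is True and parent.is_leaf is False. The defaults let you write Node(25) instead of Node(25, None, None).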

Answered By: Vlad Bezden