Type hint for NumPy ndarray dtype?

Question:

I would like a function's type hint to cover both a NumPy ndarray and its dtype.

With lists, for example, one could do the following…

def foo(bar: List[int]):
   ...

…in order to give a type hint that bar has to be a list consisting of ints.

Unfortunately, this syntax raises an exception for NumPy ndarray:

def foo(bar: np.ndarray[np.bool]):
   ...

> np.ndarray[np.bool]) (...) TypeError: 'type' object is not subscriptable

Is it possible to give dtype-specific type hints for np.ndarray?

Asked By: daniel451

Answers:

To the best of my knowledge, it’s not yet possible to specify the dtype in NumPy array type hints in function signatures. It is planned to be implemented at some point in the future. See NumPy GitHub issue #7370 and the numpy-stubs repository on GitHub for more details on the current development status.
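
For readers on a newer NumPy: releases from 1.21 onward ship a numpy.typing module with a generic NDArray alias that accepts a dtype. A minimal sketch, assuming such a version is installed; the function name count_true is made up for illustration, and the annotation is only checked by static tools such as mypy, not at runtime:

```python
import numpy as np
import numpy.typing as npt


def count_true(mask: npt.NDArray[np.bool_]) -> int:
    """Return how many entries of a boolean mask are True."""
    # The dtype in the annotation is information for static checkers only;
    # NumPy does not validate it when the function is called.
    return int(np.count_nonzero(mask))


flags = np.array([True, False, True])
total = count_true(flags)
print(total)
```

Note that this covers the dtype only; the alias leaves the array shape unspecified.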

Answered By: Xukrao

You could check out nptyping:

from nptyping import NDArray, Bool

def foo(bar: NDArray[Bool]):
   ...

Or you could just use strings for type hints:

def foo(bar: 'np.ndarray[np.bool]'):
   ...
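
The string form avoids the TypeError because Python stores a quoted annotation as plain text instead of evaluating it when the function is defined, so np.ndarray is never actually subscripted. A minimal sketch of that behaviour; numpy is deliberately not imported here, and the __annotations__ lookup is only there to show what gets stored:

```python
# numpy is intentionally NOT imported: the quoted annotation is never evaluated,
# so the name "np" does not need to exist when the function is defined.
def foo(bar: 'np.ndarray[np.bool]'):
    return bar


# The annotation is kept as an ordinary string on the function object.
annotation = foo.__annotations__['bar']
print(type(annotation).__name__, annotation)
```

Whether a static checker can make sense of the string still depends on the stubs it has available for NumPy.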

Answered By: R H

Check out the data-science-types package.

pip install data-science-types

Once it is installed, mypy has access to stubs for NumPy, pandas, and Matplotlib, which allows scenarios like:

# program.py

import numpy as np
import pandas as pd

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error

df: pd.DataFrame = pd.DataFrame({'col1': [1,2,3], 'col2': [4,5,6]}) # OK
df1: pd.DataFrame = pd.Series([1,2,3]) # error: Incompatible types in assignment (expression has type "Series[int]", variable has type "DataFrame")

Then run mypy as usual.

$ mypy program.py

Usage with function parameters:

def f(df: pd.DataFrame):
    return df.head()

if __name__ == "__main__":
    x = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6]})
    print(f(x))

$ mypy program.py
> Success: no issues found in 1 source file

One informal solution for type documentation is the following:

from typing import TypeVar, Generic
import numpy as np

Shape = TypeVar("Shape")
DType = TypeVar("DType")


class Array(np.ndarray, Generic[Shape, DType]):
    """
    Use this to type-annotate numpy arrays, e.g.

        def transform_image(image: Array['H,W,3', np.uint8], ...):
            ...

    """
    pass


def func(arr: Array['N,2', int]):
    return arr*2


print(func(arr = np.array([(1, 2), (3, 4)])))

We’ve been using this at my company and made a MyPy checker that actually checks that the shapes work out (which we should release at some point).

The only catch is that it doesn’t make PyCharm happy (i.e. you still get the nasty warning lines).

Answered By: Peter