How to type hint pandas.NA as a possible output

Question:

I have a Pandas lambda function that I use with .apply.

This function will output a dictionary with string keys and values that are either strings or pd.NA.

When I try to type hint the function I get an error:

def _the_function(x: str) -> dict[str, str | pd.NA]:
    ...
ERROR: Expected class type but received "NAType"

How can I tell the type checker about the possible pd.NA value without importing numpy and using its NaN type hint? My project has no need to import numpy.

Asked By: storms

Answers:

Your problem comes from the fact that pandas.NA is not a type. It is an instance (a singleton in fact) of the NAType class in Pandas. You need to use classes in type annotations.* More precisely, annotations must be made with instances of type (typically called classes) or special typing constructs like Union or generics.
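To see the distinction concretely, here is a minimal sketch, assuming pandas is installed (the name na_class is only for illustration):

```python
import pandas as pd

# pd.NA is an object, not a class, so a type checker rejects it as an annotation.
print(isinstance(pd.NA, type))     # False

# The class of that object is what belongs in the annotation.
na_class = type(pd.NA)
print(na_class.__name__)           # NAType
print(isinstance(na_class, type))  # True
```

Note that type(pd.NA) is only useful for inspection at runtime. A static type checker needs the class to be imported by name.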

You can fix this by importing and using that class in the type annotation:

import pandas as pd
from pandas._libs.missing import NAType
...

def _the_function(x: str) -> dict[str, str | NAType]:
    return {"foo": pd.NA}  # example to show annotations are correct

Running mypy over that code shows no errors.

The only problem is that _libs is a non-public module (as its leading underscore indicates). This may be because the NA singleton is still considered experimental, though I am not sure. Importing from non-public modules is generally discouraged, but I searched through the Pandas (and pandas-stubs) source and found no public re-export of the NAType class, so I see no other way around it.

If NA is still experimental, I suppose you know the risk you are taking when relying on it in your functions, so importing its class should not make much of a difference to you.

Hope this helps.


* Confusion can arise because Python allows using None in type annotations, even though strictly speaking it is also an object (also a singleton) of the NoneType class. But the good people working on type hints in Python decided to still allow None as a special case for convenience, even though it is not exactly consistent. As far as I know, that is the only such exception.
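The special treatment of None can be observed with the standard library alone. In this small sketch (the function name returns_nothing is made up), the typing machinery silently replaces None with its class:

```python
from typing import Optional, get_type_hints


def returns_nothing() -> None:
    return None


# get_type_hints resolves the None annotation to the NoneType class.
hints = get_type_hints(returns_nothing)
print(hints["return"] is type(None))  # True

# Optional[int] is a union of int and NoneType, not of int and the None object.
print(Optional[int].__args__ == (int, type(None)))  # True
```

No such conversion exists for pd.NA, which is why its class has to be named explicitly.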

Answered By: Daniil Fajnberg