Python type hinting for a generic mutable tuple / fixed length sequence with multiple types

Question:

I am currently working on adding type hints to a project and can’t figure out how to get this right. I have a list of lists, with the nested list containing two elements of type int and float. The first element of the nested list is always an int and the second is always a float.

my_list = [[1000, 5.5], [1432, 2.2], [1234, 0.3]]

I would like to annotate it so that unpacking the inner list in for loops or list comprehensions keeps the type information. I could change the inner lists to tuples and would get what I’m looking for:

def some_function(list_arg: list[tuple[int, float]]): pass

However, I need the inner lists to be mutable. Is there a nice way to do this for lists? I know that abstract classes like Sequence and Collection do not support multiple types.

Asked By: kman


Answers:

A mutable data structure is not compatible with a fixed length and a fixed order of element types. Sequence unpacking cannot be analyzed statically if you can sort, append, prepend or insert elements.

Imagine the following snippet:

import random

def some_function(list_arg: list[int, float]):  # not a valid annotation: list takes a single type parameter
    myint, myfloat = list_arg # ok?

    list_arg.sort()
    myint, myfloat = list_arg # ??????

    if random.random() < .5:
        list_arg.insert(1, 'yet another type!')
    myint, myfloat = list_arg  # 50% chance of an actual runtime error
                               # fat chance for any static analysis!

If mutability is a must, write a union, or another richer type hint, covering the types the inner lists may contain (A and B stand for your element types here):

def some_function(list_arg: list[list[A|B]]): pass

Alternatively, use a supertype. Since int is duck-type compatible with float (https://mypy.readthedocs.io/en/latest/duck_type_compatibility.html#duck-type-compatibility), the inner lists can be annotated as lists of float:

def some_function(list_arg: list[list[float]]): pass

If your data structure is never actually mutated, then choosing a list instead of a tuple was the first mistake.

Answered By: N1ngu

I think the question highlights a fundamental difference between statically typed Python and dynamically typed Python. For someone who is used to dynamically typed Python (or Perl or JavaScript or any number of other scripting languages), it’s perfectly normal to have diverse data types in a list. It’s convenient, flexible, and doesn’t require you to define custom data types. However, when you introduce static typing, you step into a tighter box that requires more rigorous design.

As several others have already pointed out, type annotations for lists require all elements of the list to be the same type, and don’t allow you to specify a length. Rather than viewing this as a shortcoming of the type system, you should consider that the flaw is in your own design. What you are really looking for is a class with two data members. The first data member is named 0, and has type int, and the second is named 1, and has type float. As your friend, I would recommend that you define a proper class, with meaningful names for these data members. As I’m not sure what your data type represents, I’ll make up names, for illustration.

class Sample:
    def __init__(self, atomCount: int, atomicMass: float):
        self.atomCount = atomCount
        self.atomicMass = atomicMass

This not only solves the typing problem, but also gives a major boost to readability. Your code would now look more like this:

my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]

def some_function(list_arg: list[Sample]): pass

I do think it’s worth highlighting Stef’s comment, which points to this question. The answers given highlight two useful features related to this.

First, as of Python 3.7, you can mark a class as a data class, which will automatically generate methods like __init__(). The Sample class would look like this, using the @dataclass decorator:

from dataclasses import dataclass

@dataclass
class Sample:
    atomCount: int
    atomicMass: float
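A short usage sketch of the data class follows. The helper `total_mass` and the sample values are illustrative assumptions, not part of the original project. Fields are reached by attribute rather than by unpacking, so each keeps its own type, and instances are mutable by default.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    atomCount: int
    atomicMass: float


def total_mass(samples: list[Sample]) -> float:
    """Sum atomCount * atomicMass over all samples (hypothetical helper)."""
    # s.atomCount is known to be int and s.atomicMass to be float.
    return sum(s.atomCount * s.atomicMass for s in samples)


samples = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]
samples[0].atomicMass = 6.0  # data class fields can be reassigned
```

If immutability is ever wanted instead, `@dataclass(frozen=True)` makes assignments like the last line raise an error.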

Another answer to that question mentions a PyPI package called recordclass, which it says is basically a mutable namedtuple. The typed version is called RecordClass:

from recordclass import RecordClass

class Sample(RecordClass):
    atomCount: int
    atomicMass: float

Answered By: TallChuck

One small addition to TallChuck’s answer:

from recordclass import dataobject

class Sample(dataobject):
    atomCount: int
    atomicMass: float

Defined this way, the Sample class doesn’t support a namedtuple-like API, but it does support something like the dataclasses API.

Answered By: intellimath