Using dataclasses.MISSING as optional parameter value with a Python dataclass?

Question:

I want to make the set parameter optional but still allow None to be a valid value. The documentation seemed to suggest that dataclasses.MISSING could be used as a default value to help with this.

As shown above, the MISSING value is a sentinel object used to detect
if some parameters are provided by the user. This sentinel is used
because None is a valid value for some parameters with a distinct
meaning. No code should directly use the MISSING value.

But by using this as follows:

import dataclasses
from dataclasses import dataclass, field

@dataclass
class Var:
    get: list
    set: list = dataclasses.MISSING

    def __post_init__(self):
        if self.set is dataclasses.MISSING:
            self.set = self.get

print(Var(get=['Title']))

I am getting an error:

Traceback (most recent call last):
  File "main.py", line 31, in <module>
    print(Var(get=['Title']))
TypeError: __init__() missing 1 required positional argument: 'set'

Asked By: RishiD


Answers:

I don’t know if you can use dataclasses.MISSING in this way, so I would simply use a dedicated enum. Since it’s an enum member, it’s guaranteed to be identical only to itself, so it should give us what you want:

from dataclasses import dataclass
from enum import Enum

_field_status = Enum("FieldStatus", "UNSET")

@dataclass
class Var:
    get: list
    set: list = _field_status.UNSET

    def __post_init__(self):
        if self.set is _field_status.UNSET:
            self.set = self.get

print(Var(get = [7]))

print(Var(get=[7], set=[8]))
print(Var(get=[7], set=None))

Obviously this prevents the user from setting set to _field_status.UNSET, but presumably they don’t need to do that.

Note that I am slightly confused as to why None is a valid value for something which is hinted as a list, but the principle stands.

Answered By: 2e0byo

No code should directly use the MISSING value.

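The docs include this note for a reason, so application code should avoid importing or using MISSING directly where possible. It also does not fit this use case: dataclasses uses MISSING internally to mean "no default was given", so a field whose default is MISSING is treated as having no default at all. The field stays a required argument, which is exactly the TypeError in the question.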
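To see this in practice, here is a minimal sketch that reproduces the class from the question and inspects the generated field. It relies only on dataclasses.fields and the public MISSING name; the variable names set_field, field_has_no_default and error_message are just for illustration.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class Var:
    get: list
    # Same declaration as in the question: MISSING used as the "default".
    set: list = dataclasses.MISSING


# Look up the Field object that @dataclass generated for `set`.
set_field = next(f for f in dataclasses.fields(Var) if f.name == "set")

# A default of MISSING is how dataclasses records "this field has no default".
field_has_no_default = set_field.default is dataclasses.MISSING

# Because there is no default, the generated __init__ still requires `set`.
error_message = None
try:
    Var(get=["Title"])
except TypeError as exc:
    error_message = str(exc)

print(field_has_no_default)
print(error_message)
```

In other words, assigning MISSING is indistinguishable from not assigning a default at all, so it cannot serve as a user-level "not provided" marker.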

Assumed usage (avoiding the MISSING sentinel value entirely and using dataclasses.field(...) instead):

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Var:
    get: list[str]
    set: Optional[list[str]] = field(default=None)


print(Var(get=['Title']))
# Var(get=['Title'], set=None)

But where is MISSING actually used?

MISSING is a sentinel object that the dataclasses module uses internally, behind the scenes.

You can inspect the source code for dataclasses.field and find a clear usage of it there:

def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
          hash=None, compare=True, metadata=None):

You’ll see that the declared default for parameters such as default is default=MISSING rather than default=None. This is done mainly so that the field factory function can tell whether the user actually passed a value for default or default_factory. For example, it’s perfectly valid to pass field(default=None), as we did in the example above; because the parameter’s own default is MISSING rather than None, dataclasses can detect that a value (here None) was passed explicitly.
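As a small illustration of that detection, the sketch below compares a field with no default against one declared with field(default=None). The class name Example and its field names are hypothetical; dataclasses.fields and the Field.default attribute are part of the standard library.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import Optional


@dataclass
class Example:
    required: int                                  # no default given
    optional: Optional[int] = field(default=None)  # explicit default of None


# Field.default is MISSING only when no default was supplied,
# so None remains a perfectly distinguishable default value.
has_default = {f.name: f.default is not MISSING for f in fields(Example)}

print(has_default)
```

Here the check reports that required has no default while optional does, even though the default of optional is None.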

How MISSING is declared

If you inspect the source code of the dataclasses module, or Ctrl (Command on Mac) + left-click on the name MISSING anywhere in your code in an IDE, you can see how MISSING is actually declared:

# A sentinel object to detect if a parameter is supplied or not.  Use
# a class to give it a better repr.
class _MISSING_TYPE:
    pass
MISSING = _MISSING_TYPE()

A Potential Solution

Piggy-backing off how the dataclasses module defines MISSING, you could theoretically define your own (empty) class, and then instantiate the class.
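That instance-based variant might look roughly like the sketch below. The names _UnsetType and UNSET are hypothetical, and the custom __repr__ is an optional extra so the sentinel prints readably.

```python
from dataclasses import dataclass
from typing import Optional


class _UnsetType:
    """Hypothetical sentinel type, mirroring how dataclasses defines MISSING."""

    def __repr__(self):
        return "UNSET"


# Single module-level instance; compare against it with `is`.
UNSET = _UnsetType()


@dataclass
class Var:
    get: list
    set: Optional[list] = UNSET

    def __post_init__(self):
        # Only fall back to `get` when the caller omitted `set` entirely.
        if self.set is UNSET:
            self.set = self.get


print(Var(get=[7]))
print(Var(get=[7], set=None))
```

An explicit set=None is left untouched, because None is not the UNSET instance.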

However, I feel the instantiation step can be skipped in this scenario, since the class object itself can serve as the sentinel. Here’s a one-liner to create a sentinel class / type:

class _UNSET: ...  # here the ellipsis (`...`) is essentially the same as `pass`

Usage, then, would be as follows:

from __future__ import annotations

from dataclasses import dataclass


class _UNSET: ...


@dataclass
class Var:
    get: list
    set: list | None = _UNSET

    def __post_init__(self):
        if self.set is _UNSET:
            self.set = self.get


print(Var(get=[7]))
print(Var(get=[7], set=[8]))
print(Var(get=[7], set=None))

This correctly distinguishes between set being omitted from the constructor call and a value for set being passed explicitly, such as set=None.

The result is also as expected:

Var(get=[7], set=[7])
Var(get=[7], set=[8])
Var(get=[7], set=None)
Answered By: rv.kvetch