Truth value of empty set

Question:

I am interested in the truth value of Python sets like {'a', 'b'}, or the empty set set() (which is not the same as the empty dictionary {}). In particular, I would like to know whether bool(my_set) is False if and only if the set my_set is empty.

Ignoring primitive types (such as numbers) as well as user-defined types, https://docs.python.org/3/library/stdtypes.html#truth says:

The following values are considered false:

  • […]
  • any empty sequence, for example, '', (), [].
  • any empty mapping, for example, {}.
  • […]

All other values are considered true

According to https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range, a set is not a sequence (it is unordered, its elements do not have indices, etc.):

There are three basic sequence types: lists, tuples, and range objects.

And, according to https://docs.python.org/3/library/stdtypes.html#mapping-types-dict,

There is currently only one standard mapping type, the dictionary.

So, as far as I understand, the set type is not a type that can ever be False. However, when I try, bool(set()) evaluates to False.

Questions:

  • Is this a documentation problem, or am I getting something wrong?
  • Is the empty set the only set whose truth value is False?

Asked By: Peter Thomassen


Answers:

After looking at the CPython source code, I would guess this is a documentation error. However, the behavior could be implementation-dependent, so it would be a good issue to raise on the Python bug tracker.

Specifically, object.c defines the truth value of an item as follows:

int
PyObject_IsTrue(PyObject *v)
{
    Py_ssize_t res;
    if (v == Py_True)
        return 1;
    if (v == Py_False)
        return 0;
    if (v == Py_None)
        return 0;
    else if (v->ob_type->tp_as_number != NULL &&
             v->ob_type->tp_as_number->nb_bool != NULL)
        res = (*v->ob_type->tp_as_number->nb_bool)(v);
    else if (v->ob_type->tp_as_mapping != NULL &&
             v->ob_type->tp_as_mapping->mp_length != NULL)
        res = (*v->ob_type->tp_as_mapping->mp_length)(v);
    else if (v->ob_type->tp_as_sequence != NULL &&
             v->ob_type->tp_as_sequence->sq_length != NULL)
        res = (*v->ob_type->tp_as_sequence->sq_length)(v);
    else
        return 1;
    /* if it is negative, it should be either -1 or -2 */
    return (res > 0) ? 1 : Py_SAFE_DOWNCAST(res, Py_ssize_t, int);
}

We can clearly see that a value is always true unless it is a boolean, None, a type that defines nb_bool, or a sequence or mapping type, which would require tp_as_sequence or tp_as_mapping to be set.
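The lookup order in the C function can be sketched in Python. This is only a rough model: is_true is a hypothetical helper, not part of CPython. It assumes that nb_bool surfaces as __bool__ and that mp_length and sq_length both surface as __len__. It also skips the error handling for negative lengths.

```python
def is_true(obj):
    """Rough Python model of PyObject_IsTrue (hypothetical helper, not a CPython API)."""
    # The three singleton shortcuts at the top of the C function.
    if obj is True:
        return True
    if obj is False or obj is None:
        return False
    cls = type(obj)
    # Assumption: the nb_bool slot is exposed to Python code as __bool__.
    bool_hook = getattr(cls, "__bool__", None)
    if bool_hook is not None:
        return bool_hook(obj)
    # Assumption: mp_length and sq_length are both exposed as __len__.
    len_hook = getattr(cls, "__len__", None)
    if len_hook is not None:
        return len_hook(obj) > 0
    # No hook at all: the object is considered true.
    return True


# Compare the model with the built-in bool() on a few values.
for value in (set(), {"a", "b"}, [], {}, object()):
    print(repr(value), is_true(value), bool(value))
```

For the built-in set, the model falls through to the __len__ branch, which matches the sq_length branch in the C code.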

Fortunately, setobject.c shows that sets do implement tp_as_sequence, which suggests the documentation is incorrect.

PyTypeObject PySet_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "set",                              /* tp_name */
    sizeof(PySetObject),                /* tp_basicsize */
    0,                                  /* tp_itemsize */
    /* methods */
    (destructor)set_dealloc,            /* tp_dealloc */
    0,                                  /* tp_print */
    0,                                  /* tp_getattr */
    0,                                  /* tp_setattr */
    0,                                  /* tp_reserved */
    (reprfunc)set_repr,                 /* tp_repr */
    &set_as_number,                     /* tp_as_number */
    &set_as_sequence,                   /* tp_as_sequence */
    0,                                  /* tp_as_mapping */
    /* ellipsed lines */
};

Dicts also implement tp_as_sequence, so it seems that although a set is not a sequence type, it is sequence-like enough for its truth value to be determined by its length.

In my opinion, the documentation should clarify this: the truth value of mapping-like or sequence-like types depends on their length.

Edit: As user2357112 correctly points out, tp_as_sequence and tp_as_mapping do not mean the type is a sequence or a mapping. For example, dict implements tp_as_sequence, and list implements tp_as_mapping.
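The C-level slots are not directly visible from Python, but the abstract base classes in collections.abc give a rough Python-level view of the same distinction. The sketch below is an illustration rather than a mirror of the C structures: it checks which built-in containers count as a Sequence, a Mapping, or merely Sized (that is, they have __len__). The names samples and classification are made up for this example.

```python
from collections.abc import Mapping, Sequence, Sized

# Hypothetical sample values, one per built-in container type.
samples = {"set": set(), "frozenset": frozenset(), "list": [], "dict": {}}

# For each type: (is a Sequence, is a Mapping, is Sized).
classification = {
    name: (
        isinstance(value, Sequence),
        isinstance(value, Mapping),
        isinstance(value, Sized),
    )
    for name, value in samples.items()
}

for name, flags in classification.items():
    print(name, flags)
```

In this view a set is neither a sequence nor a mapping, yet it still has a length, and the length is what truth testing uses.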

Answered By: Alexander Huszagh

That part of the docs is poorly written, or rather, poorly maintained. The following clause:

instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool value False.

really applies to all classes, user-defined or not, including set, dict, and even the types listed in all the other clauses (all of which define either __bool__ or __len__). (In Python 2, None is false despite not having a __len__ or Python 2’s equivalent of __bool__, but that exception is gone since Python 3.3.)
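A quick check of that claim, assuming Python 3: every built-in falsy value below belongs to a type that defines __bool__ or __len__. The list falsy_values and the class Basket are hypothetical, chosen only for illustration.

```python
# Built-in values that are false in Python 3.
falsy_values = [None, False, 0, 0.0, 0j, "", b"", (), [], {}, set(), frozenset(), range(0)]

for value in falsy_values:
    cls = type(value)
    # Collect whichever of the two truth-testing hooks the type defines.
    hooks = [name for name in ("__bool__", "__len__") if hasattr(cls, name)]
    print(f"{cls.__name__:10} bool={bool(value)!s:5} hooks={hooks}")


class Basket:
    """Hypothetical user-defined container that only defines __len__."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


# The user-defined class follows the same rule as the built-ins.
print(bool(Basket()), bool(Basket(["apple"])))
```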

I say poorly maintained because this section has been almost unchanged since at least Python 1.4, and maybe earlier. It’s been updated for the addition of False and the removal of separate int/long types, but not for type/class unification or the introduction of sets.

Back when the quoted clause was written, user-defined classes and built-in types really did behave differently, and I don’t think built-in types actually had __bool__ or __len__ at the time.

Answered By: user2357112

The documentation for __bool__ states that this method is called for truth value testing and that, if it is not defined, __len__ is evaluated instead:

Called to implement truth value testing and the built-in operation bool(); […] When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.

This holds for any Python object. As we can see, set does not define a __bool__ method:

>>> set.__bool__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'set' has no attribute '__bool__'

so the truth testing falls back on __len__:

>>> set.__len__
<slot wrapper '__len__' of 'set' objects>

Therefore only an empty set (zero-length) is considered false.
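A small sanity check of this conclusion for the built-in set and frozenset types. The list candidates is just an illustrative sample; it includes sets whose only element is itself falsy, since those are the cases most likely to cause doubt.

```python
# Illustrative sample: empty sets, sets holding a single falsy element,
# and an ordinary non-empty set.
candidates = [
    set(),
    frozenset(),
    {0},
    {False},
    {None},
    {""},
    {frozenset()},
    frozenset({0}),
    {"a", "b"},
]

# Record (set, truth value, length) for each candidate.
results = [(s, bool(s), len(s)) for s in candidates]

for s, truth, size in results:
    # The truth value should agree with "has at least one element".
    assert truth == (size > 0)
    print(repr(s), truth, size)
```

Note that this only covers the built-in types. A subclass of set that defines its own __bool__ could behave differently.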

The truth value testing section of the documentation is incomplete in this respect.

Answered By: a_guest