# Union of 2 sets does not contain all items

## Question:

How come when I change the order of the two sets in the unions below, I get different results?

```
set1 = {1, 2, 3}
set2 = {True, False}
print(set1 | set2)
# {False, 1, 2, 3}
print(set2 | set1)
#{False, True, 2, 3}
```

## Answers:

In Python, `False`

and `0`

are considered equivalent, as are `True`

and `1`

. Because `True`

and `1`

are considered the same value, only one of them can be present in a set a the same time. Which one depends on the order they are added to the set in. In the first line, `set1`

is used as the first set, so we get `1`

in the resulting set. In the second set, `True`

is in the first set, so `True`

is included in the result.

The comparison operator (`==`

, `!=`

) is defined for boolean `True`

and `False`

to match 1 and 0.

That’s why, in the set union, when it checks whether `True`

is in the new set already, it gets a truthy answer:

```
>>> True in {1}
True
>>> 1 in {True}
True
```

If you look at https://docs.python.org/3/library/stdtypes.html#boolean-values section 4.12.10. Boolean Values:

Boolean values are the two constant objects

False and True. They are used to represent truth values (although other values can also be considered false or true). In numeric contexts (for example when used as the argument to an arithmetic operator), they behave like theintegers 0 and 1, respectively.

# Why the union() doesn’t contain all items

The `1`

and `True`

are equivalent and considered to be duplicates. Likewise the `0`

and `False`

are equivalent as well:

```
>>> 1 == True
True
>>> 0 == False
True
```

# Which equivalent value is used

When multiple equivalent values are encountered, sets keep the first one seen:

```
>>> {0, False}
{0}
>>> {False, 0}
{False}
```

# Ways to make the values be distinct

To get them to be treated as distinct, just store them in a `(value, type)`

pair:

```
>>> set1 = {(1, int), (2, int), (3, int)}
>>> set2 = {(True, bool), (False, bool)}
>>> set1 | set2
{(3, <class 'int'>), (1, <class 'int'>), (2, <class 'int'>),
(True, <class 'bool'>), (False, <class 'bool'>)}
>>> set1 & set2
set()
```

Another way to make the values distinct is to store them as strings:

```
>>> set1 = {'1', '2', '3'}
>>> set2 = {'True', 'False'}
>>> set1 | set2
{'2', '3', 'False', 'True', '1'}
>>> set1 & set2
set()
```

Hope this clears up the mystery and shows the way forward ðŸ™‚

Rescued from the comments:

This is the standard technique for breaking cross-type equivalence (i.e. `0.0 == 0`

, `True == 1`

, and `Decimal(8.5) == 8.5)`

. The technique is used in Python 2.7’s regular expression module to force unicode regexes to be cached distinctly from otherwise equivalent str regexes. The technique is also used in Python 3 for functools.lru_cache() when the typed parameter is true.

If the OP needs something other than the default equivalence relation, then some new relation needs to be defined. Depending the use case, that could be case-insensitivity for strings, normalization for unicode, visual appearance (things that look different are considered different), identity (no two distinct objects are considered equal), a value/type pair, or some other function that defines an equivalence relation. Given the OPs specific example, it would seem that he/she expected either distinction by type or visual distinction.