Why does Python's max() return different results when a float('NaN') value is moved to a different key in a dictionary, even though the key with the maximum value stays the same?

Question:

Let’s pretend I have the following simple dictionary:

dictionary = {'a':3, 'b':4, 'c':float('NaN')}

If I use function max() to return the key with maximum value…

key_maxvalue = max(dictionary, key=dictionary.get)
print(key_maxvalue)

…python outputs this:

b

However, when I permute the values of keys ‘a’ and ‘c’…

dictionary = {'a':float('NaN'), 'b':4, 'c':3}
key_maxvalue = max(dictionary, key=dictionary.get)
print(key_maxvalue)

…I get this unexpected result:

a

I expected Python to output ‘b’, as that key still has the maximum value in the dictionary. Why has changing the order of the values altered the output of max()? Furthermore, how can I prevent this (unexpected) behavior?

Asked By: NigelBlainey

||

Answers:

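The answer is, "don’t use NaN". The point of a NaN is that it is not a number, and it cannot be relied on to act like a number in any rational way. What you’re seeing is that NaN is unordered: it is neither less than, greater than, nor equal to any number, so the result of max() depends on where the NaN sits in the sequence.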

Notice this:

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> x = float('NaN')
>>> 1 < x
False
>>> x < 1
False
>>> 

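Every ordering or equality comparison with a NaN is False (only != returns True). That makes sorting a sequence that contains NaN, or taking its maximum, depend on the order of the elements.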
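
One way to apply the "don’t use NaN" advice when the data may already contain NaN is to skip those entries before calling max(). The following is only a minimal sketch: the helper name key_of_max_ignoring_nan is hypothetical, and it assumes every value in the dictionary is a real number that math.isnan() accepts.

```python
import math

def key_of_max_ignoring_nan(d):
    """Return the key with the largest non-NaN value (hypothetical helper)."""
    # Keep only the keys whose values are actual numbers.
    candidates = [k for k, v in d.items() if not math.isnan(v)]
    # max() raises ValueError if every value was NaN or the dict is empty.
    return max(candidates, key=d.get)

print(key_of_max_ignoring_nan({'a': 3, 'b': 4, 'c': float('NaN')}))  # b
print(key_of_max_ignoring_nan({'a': float('NaN'), 'b': 4, 'c': 3}))  # b
```

With the NaN entries filtered out, the result no longer depends on where the NaN appears in the dictionary.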

Answered By: Tim Roberts

If you wrote your own function, it might look like this:

def max(nums):
    largest = nums[0]

    for item in nums:
        if item > largest:
            largest = item

    return largest

The problem is the comparison item > largest. Look at what happens when you compare a number with np.nan (NumPy’s NaN, which behaves the same as float('NaN')).

Input: np.nan > 4

Output: False

Input: 4 > np.nan

Output: False

Any ordering comparison with a NaN is False. If the built-in max works like the function written above, that explains both of your cases. In the first case, NaN is not greater than 4, so b remains the max. In the second case, a (whose value is NaN) is the initial candidate, and no other number compares greater than NaN, so a remains the max.
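
The reasoning above can be checked with a small sketch that applies the same loop to dictionary keys. It is only a model of what max(dictionary, key=dictionary.get) might do, assuming a left-to-right scan that replaces the current candidate only when a later value compares greater; the function name first_max_key is hypothetical.

```python
def first_max_key(d):
    """Model of max(d, key=d.get): keep the first key unless a later value is greater."""
    keys = iter(d)
    best = next(keys)          # the first key is the initial candidate
    for k in keys:
        if d[k] > d[best]:     # always False when either side is NaN
            best = k
    return best

nan = float('NaN')
print(first_max_key({'a': 3, 'b': 4, 'c': nan}))  # b
print(first_max_key({'a': nan, 'b': 4, 'c': 3}))  # a
```

Both results match the outputs reported in the question: once NaN becomes the candidate, no later value can displace it.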

Answered By: Michael Cao