In Python, why did I get -9223372036854775808 when I set one row of a 2D array to np.nan?
Question:
In Python, if I define a 2D array and set the second row to np.nan, the second row becomes all -9223372036854775808 rather than missing values. Here is an example:
b = np.array(
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
[0, 3, 4, 4, 6, 6, 6, 5, 4, 5],
[0, 0, 0, 3, 6, 6, 6, 6, 6, 6],
[0, 0, 3, 4, 6, 6, 6, 6, 6, 6],
[0, 1, 2, 4, 4, 4, 4, 4, 4, 4]])
b[1, :] = np.nan
print(b)
[[                   0                    0                    0
                     0                    0                    0
                     0                    0                    0
                     5]
 [-9223372036854775808 -9223372036854775808 -9223372036854775808
  -9223372036854775808 -9223372036854775808 -9223372036854775808
  -9223372036854775808 -9223372036854775808 -9223372036854775808
  -9223372036854775808]
 [                   0                    0                    0
                     3                    6                    6
                     6                    6                    6
                     6]
 [                   0                    0                    3
                     4                    6                    6
                     6                    6                    6
                     6]
 [                   0                    1                    2
                     4                    4                    4
                     4                    4                    4
                     4]]
Does anyone have any idea what is happening? And how should I correctly assign np.nan to one row?
For your reference, I am running this code in a Python 3.7.10 environment created by mamba on Ubuntu 16.04.7 LTS (GNU/Linux 4.15.0-132-generic x86_64).
Answers:
np.nan is a special floating-point value that cannot be stored in integer arrays. Since b is an array of integers, the assignment b[1, :] = np.nan attempts to convert np.nan to an integer, which is undefined behavior. See this for a discussion of a similar issue.
First of all, nan is a special value that exists for float arrays only.
I tried running your code in my Python 3.8 (64-bit) environment on x64-based Windows.
b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
[0, 3, 4, 4, 6, 6, 6, 5, 4, 5],
[0, 0, 0, 3, 6, 6, 6, 6, 6, 6],
[0, 0, 3, 4, 6, 6, 6, 6, 6, 6],
[0, 1, 2, 4, 4, 4, 4, 4, 4, 4]])
b[1, :] = np.nan
print(b)
This is what I got:
[[          0           0           0           0           0           0
            0           0           0           5]
 [-2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
  -2147483648 -2147483648 -2147483648 -2147483648]
 [          0           0           0           3           6           6
            6           6           6           6]
 [          0           0           3           4           6           6
            6           6           6           6]
 [          0           1           2           4           4           4
            4           4           4           4]]
In the case of an int array, I got the lower bound of my platform's default integer type (32-bit) in place of NaN; you are getting the same thing for your environment's 64-bit integer type.
So instead of an int array, you can use a float array:
b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
[0, 3, 4, 4, 6, 6, 6, 5, 4, 5],
[0, 0, 0, 3, 6, 6, 6, 6, 6, 6],
[0, 0, 3, 4, 6, 6, 6, 6, 6, 6],
[0, 1, 2, 4, 4, 4, 4, 4, 4, 4]], dtype=float)
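For instance, a short sketch of the complete fix (same data, float dtype), using np.isnan to verify the result, since NaN never compares equal to itself:

```python
import numpy as np

b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
              [0, 3, 4, 4, 6, 6, 6, 5, 4, 5],
              [0, 0, 0, 3, 6, 6, 6, 6, 6, 6],
              [0, 0, 3, 4, 6, 6, 6, 6, 6, 6],
              [0, 1, 2, 4, 4, 4, 4, 4, 4, 4]], dtype=float)
b[1, :] = np.nan

# NaN != NaN, so test for missing values with np.isnan rather than ==:
print(np.isnan(b[1]).all())    # True
print(int(np.isnan(b).sum()))  # 10
```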
You initialised your array with integers. Integers have no possible "nan" value, so the cast falls back to the type's minimal value. A quick fix is to initialise your array with a float dtype, whose elements are allowed to be "nan":
b = np.array(
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
[0, 3, 4, 4, 6, 6, 6, 5, 4, 5],
[0, 0, 0, 3, 6, 6, 6, 6, 6, 6],
[0, 0, 3, 4, 6, 6, 6, 6, 6, 6],
[0, 1, 2, 4, 4, 4, 4, 4, 4, 4]], dtype=float)
b[1, :] = np.nan
print(b)
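If the integer array already exists, converting it with astype works as well — a sketch along the same lines, not part of the original answer:

```python
import numpy as np

b = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
              [0, 3, 4, 4, 6, 6, 6, 5, 4, 5]])

# astype returns a new float copy; the original int array is unchanged.
b = b.astype(float)
b[1, :] = np.nan
print(b)
```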
This number is equal to -2⁶³, the minimum value of a signed 64-bit integer. I don't code in Python, but it seems that NaN has no integer representation, so the cast produced the lowest value the integer type can hold.
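That identity can be checked directly with np.iinfo:

```python
import numpy as np

# Minimum of a signed 64-bit integer, matching the value in the question:
print(np.iinfo(np.int64).min)            # -9223372036854775808
print(np.iinfo(np.int64).min == -2**63)  # True

# The 32-bit counterpart matches the Windows output in the other answer:
print(np.iinfo(np.int32).min)            # -2147483648
```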