convert nan value to zero
Question:
I have a 2D numpy array. Some of the values in this array are NaN. I want to perform certain operations using this array. For example consider the array:
[[ 0. 43. 67. 0. 38.]
[ 100. 86. 96. 100. 94.]
[ 76. 79. 83. 89. 56.]
[ 88. NaN 67. 89. 81.]
[ 94. 79. 67. 89. 69.]
[ 88. 79. 58. 72. 63.]
[ 76. 79. 71. 67. 56.]
[ 71. 71. NaN 56. 100.]]
I am trying to take each row, one at a time, sort it in reversed order to get max 3 values from the row and take their average. The code I tried is:
# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3
This does not work for rows containing NaN. My question is: is there a quick way to convert all NaN values to zero in the 2D numpy array, so that I have no problems with sorting and the other things I am trying to do?
Answers:
Where A is your 2D array:
import numpy as np
A[np.isnan(A)] = 0
The function isnan produces a bool array indicating where the NaN values are. A boolean array can be used to index an array of the same shape. Think of it like a mask.
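As a minimal sketch of the masking idea (the array values and the names mask and nan_count are made up for illustration, not taken from the question), the mask can be built once, inspected, and then used for the in-place assignment:

```python
import numpy as np

# Small illustrative array; the values are made up for this sketch.
A = np.array([[88.0, np.nan, 67.0],
              [71.0, 71.0, np.nan]])

mask = np.isnan(A)           # boolean array, True where A holds NaN
nan_count = int(mask.sum())  # number of NaN entries found
A[mask] = 0                  # in place: only the masked positions are overwritten
```

Because the assignment happens in place, A must already be a float array; an integer array cannot hold NaN in the first place.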
This should work:
import numpy as np
a = np.array([[1, 2, 3], [0, 3, np.nan]])
where_are_NaNs = np.isnan(a)
a[where_are_NaNs] = 0
In the above case where_are_NaNs is:
In [12]: where_are_NaNs
Out[12]:
array([[False, False, False],
[False, False, True]], dtype=bool)
For your purposes, if all the items are stored as str, you can keep using sorted as you do now, then check whether the first element is NaN and replace it with '0':
>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1,reverse=True)
>>> n
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
... n[0] = '0'
...
>>> n
['0', '89', '88', '81', '67']
How about nan_to_num()?
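NaN is never equal to itself, so for a scalar z:
if z != z: z = 0
For a 2D array, a row-by-row loop with the same test does not work: comparing a whole row returns an array rather than a single bool, and rebinding the loop variable would not change nparr. Apply the test elementwise instead:
nparr[nparr != nparr] = 0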
A code example for drake’s answer to use nan_to_num:
>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.nan]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1., 2., 3.],
[ 0., 3., 0.]])
You can use numpy.nan_to_num:
numpy.nan_to_num(x): Replace NaN with zero and inf with finite numbers.
Example (see the docs):
>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([ 1.79769313e+308, -1.79769313e+308, 0.00000000e+000,
-1.28000000e+002, 1.28000000e+002])
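If the very large default replacements for inf are not wanted, nan_to_num also accepts the keyword arguments nan, posinf and neginf (in NumPy 1.17 and later). A minimal sketch, where the replacement values 1e6 and -1e6 are arbitrary choices for illustration:

```python
import numpy as np

x = np.array([np.inf, -np.inf, np.nan, -128.0, 128.0])

# nan, posinf and neginf are keyword arguments of np.nan_to_num
# (NumPy 1.17+); the replacement values below are arbitrary.
cleaned = np.nan_to_num(x, nan=0.0, posinf=1e6, neginf=-1e6)
```

By default nan_to_num returns a new array, so x itself is left unchanged.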
You could use np.where to find where you have NaN:
import numpy as np
a = np.array([[ 0, 43, 67, 0, 38],
[ 100, 86, 96, 100, 94],
[ 76, 79, 83, 89, 56],
[ 88, np.nan, 67, 89, 81],
[ 94, 79, 67, 89, 69],
[ 88, 79, 58, 72, 63],
[ 76, 79, 71, 67, 56],
[ 71, 71, np.nan, 56, 100]])
b = np.where(np.isnan(a), 0, a)
In [20]: b
Out[20]:
array([[ 0., 43., 67., 0., 38.],
[ 100., 86., 96., 100., 94.],
[ 76., 79., 83., 89., 56.],
[ 88., 0., 67., 89., 81.],
[ 94., 79., 67., 89., 69.],
[ 88., 79., 58., 72., 63.],
[ 76., 79., 71., 67., 56.],
[ 71., 71., 0., 56., 100.]])
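Once the NaN values are replaced, the per-row average of the three highest values from the question can be computed without a Python loop. This is only a sketch of one possible approach, using two rows from the question's array; np.sort along axis 1 is a choice made here, not something the answers above prescribe:

```python
import numpy as np

# Two rows taken from the question's array, including the NaN entries.
a = np.array([[88, np.nan, 67, 89, 81],
              [71, 71, np.nan, 56, 100]])

b = np.where(np.isnan(a), 0, a)    # NaN -> 0; `a` itself is unchanged
top3 = np.sort(b, axis=1)[:, -3:]  # three largest values of each row
avg_highest_3 = top3.mean(axis=1)  # one average per row
```

Note that replacing NaN with 0 only keeps it out of the top three when a row has at least three real values larger than 0.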
You can use a lambda function; an example for a 1D array:
import numpy as np
a = [np.nan, 2, 3]
# In Python 3, map returns an iterator, so wrap it in list() to get the values
list(map(lambda v: 0 if np.isnan(v) else v, a))
This will give you the result:
[0, 2, 3]