Checking if particular value (in cell) is NaN in pandas DataFrame not working using ix or iloc
Question:
Let's say I have the following pandas DataFrame:
import pandas as pd
df = pd.DataFrame({"A":[1,pd.np.nan,2], "B":[5,6,0]})
Which would look like:
>>> df
     A  B
0  1.0  5
1  NaN  6
2  2.0  0
First option
I know one way to check if a particular value is NaN, which is as follows:
>>> df.isnull().ix[1,0]
True
Second option (not working)
I thought the option below, using ix, would work as well, but it does not:
>>> df.ix[1,0]==pd.np.nan
False
I also tried iloc, with the same result:
>>> df.iloc[1,0]==pd.np.nan
False
However, if I check for those values using ix or iloc, I get:
>>> df.ix[1,0]
nan
>>> df.iloc[1,0]
nan
So, why is the second option not working? Is it possible to check for NaN values using ix or iloc?
Answers:
Try this:
In [107]: pd.isnull(df.iloc[1,0])
Out[107]: True
UPDATE: in newer pandas versions, use pd.isna():
In [7]: pd.isna(df.iloc[1,0])
Out[7]: True
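As for why the == comparison in the question returns False: under IEEE 754 floating-point rules, NaN compares unequal to every value, including itself, so value == np.nan can never be True. A minimal sketch of this, assuming a recent pandas release where pd.np and .ix are no longer available (so NumPy is imported directly and .iloc is used instead):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})
value = df.iloc[1, 0]  # the NaN cell from the question

# NaN is never equal to anything, not even to another NaN.
equal_to_nan = bool(value == np.nan)

# The flip side: NaN is the only float that is unequal to itself.
unequal_to_itself = bool(value != value)

# pd.isna() is the intended way to test a single cell.
detected_by_isna = pd.isna(value)

print(equal_to_nan, unequal_to_itself, detected_by_isna)
```

So the indexing with iloc is fine; it is only the equality test that has to be replaced by pd.isna().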
The above answer is excellent. Here is the same with an example for better understanding.
>>> import pandas as pd
>>>
>>> import numpy as np
>>>
>>> pd.Series([np.nan, 34, 56])
0     NaN
1    34.0
2    56.0
dtype: float64
>>>
>>> s = pd.Series([np.nan, 34, 56])
>>> pd.isnull(s[0])
True
>>>
I also tried a few alternatives, and none of the following worked. Thanks to @MaxU.
>>> s[0]
nan
>>>
>>> s[0] == np.nan
False
>>>
>>> s[0] is np.nan
False
>>>
>>> s[0] == 'nan'
False
>>>
>>> s[0] == pd.np.nan
False
>>>
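Each of these attempts fails for a slightly different reason: == fails because NaN never compares equal to anything, is fails because the Series returns a fresh NumPy scalar rather than the np.nan object itself, and the comparison with 'nan' fails because the cell holds a float, not a string. A short sketch of the failing and the working checks, using s.iloc[0] for positional access (my choice here, not part of the original answer):

```python
import math

import numpy as np
import pandas as pd

s = pd.Series([np.nan, 34, 56])
first = s.iloc[0]  # a NumPy float64 scalar holding NaN

# Why each attempt fails:
equals_nan = bool(first == np.nan)   # NaN never compares equal, even to NaN
same_object = first is np.nan        # the Series hands back a new scalar object
equals_text = bool(first == "nan")   # a float is not the string 'nan'

# Checks that do work on a scalar:
via_pandas = pd.isna(first)
via_numpy = bool(np.isnan(first))
via_math = math.isnan(first)
via_self_compare = bool(first != first)  # only NaN is unequal to itself

print(equals_nan, same_object, equals_text)
print(via_pandas, via_numpy, via_math, via_self_compare)
```

Of the working checks, pd.isna() is the most general, since np.isnan() and math.isnan() only accept numeric input while pd.isna() also handles None and NaT.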
pd.isna(cell_value) can be used to check if a given cell value is NaN. Alternatively, use pd.notna(cell_value) to check the opposite.
From source code of pandas:
def isna(obj):
    """
    Detect missing values for an array-like object.

    This function takes a scalar or array-like object and indicates
    whether values are missing (``NaN`` in numeric arrays, ``None`` or ``NaN``
    in object arrays, ``NaT`` in datetimelike).

    Parameters
    ----------
    obj : scalar or array-like
        Object to check for null or missing values.

    Returns
    -------
    bool or array-like of bool
        For scalar input, returns a scalar boolean.
        For array input, returns an array of boolean indicating whether each
        corresponding element is missing.

    See Also
    --------
    notna : Boolean inverse of pandas.isna.
    Series.isna : Detect missing values in a Series.
    DataFrame.isna : Detect missing values in a DataFrame.
    Index.isna : Detect missing values in an Index.

    Examples
    --------
    Scalar arguments (including strings) result in a scalar boolean.

    >>> pd.isna('dog')
    False

    >>> pd.isna(np.nan)
    True
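The scalar versus array-like behaviour described in the docstring can be tried on the DataFrame from the question. This is only a small sketch; the variable names are mine, and np.nan is used in place of pd.np.nan on the assumption of a recent pandas release:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})

# Scalar input gives a single boolean.
cell_is_missing = pd.isna(df.iloc[1, 0])
cell_is_present = pd.notna(df.iloc[1, 1])

# Array-like input gives an element-wise boolean mask of the same shape.
mask = pd.isna(df)

# Summing the mask counts the missing values in each column.
missing_per_column = mask.sum()

print(mask)
print(missing_per_column)
```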
I came up with a workaround:
x = [np.nan]
In [4]: x[0] == np.nan
Out[4]: False
but:
In [5]: np.nan in x
Out[5]: True
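To understand why this works, look at how list containment (the in operator) is implemented: it checks identity (is) before equality (==), and here x[0] is the very same object as np.nan.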
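A word of caution on this workaround: because the in operator on a list tests identity first and equality second, it only finds NaN when the list holds the very same object. A NaN read back from a Series or DataFrame is a different object, so the test fails for it. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np
import pandas as pd

x = [np.nan]

# Found, because x[0] *is* the np.nan object (identity check succeeds).
same_object_found = np.nan in x

# A separately created NaN is a different object, and NaN == NaN is False.
other_nan = float("nan")
fresh_nan_found = other_nan in x

# The same happens with a NaN taken out of a pandas object.
s = pd.Series([np.nan, 34, 56])
cell_found = s.iloc[0] in x

print(same_object_found, fresh_nan_found, cell_found)
```

For values coming out of a DataFrame, pd.isna() is therefore the safer check.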
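df.isnull().loc[1, "A"]
This syntax also works. Note that .loc selects by label, so for the DataFrame in the question the column has to be given as "A" rather than 0.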
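To illustrate the difference between label-based and position-based access on the question's DataFrame, here is a small sketch. It assumes the columns are labelled "A" and "B" as in the question; on a DataFrame with integer column labels, .loc[1, 0] would be valid.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})
mask = df.isna()

by_position = mask.iloc[1, 0]   # row position 1, column position 0
by_label = mask.loc[1, "A"]     # row label 1, column label "A"

# With string column labels there is no column called 0,
# so a label lookup for it raises KeyError.
try:
    mask.loc[1, 0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_position, by_label, label_zero_exists)
```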