Check for None in a pandas DataFrame
Question:
I would like to find where None is found in the dataframe.
pd.DataFrame([None,np.nan]).isnull()
OUT:
      0
0  True
1  True
isnull() finds both numpy NaN and None values.
I only want the None values, not numpy NaN. Is there an easier way to do that without looping through the dataframe?
Edit:
After reading the comments, I realized that my dataframe at work also includes strings, so the None values were not coerced to numpy NaN. So the answer given by Pisdom works.
Answers:
You could use applymap with a lambda to check whether an element is None, as follows. (I constructed a different example, because in your original one None is coerced to np.nan: the data type is float, and you need an object-dtype column to hold None as-is. Or, as @Evert commented, None and NaN are indistinguishable in numeric-type columns.)
df = pd.DataFrame([[None, 3], ["", np.nan]])
df
#      0    1
#0  None  3.0
#1        NaN
df.applymap(lambda x: x is None)
#       0      1
#0   True  False
#1  False  False
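If you need the integer (row, column) positions of the None cells rather than a boolean frame, the mask from applymap can be fed to np.where. A minimal sketch (note that pandas ≥ 2.1 deprecates applymap in favor of the element-wise DataFrame.map, which takes the same lambda):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[None, 3], ["", np.nan]])

# Boolean mask: True only where the element is literally None
mask = df.applymap(lambda x: x is None)

# np.where on the mask yields the integer row/column positions of True cells
rows, cols = np.where(mask)
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(0, 0)]
```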
If you want to get True/False for each element, you can use the following checks. Here is a summary of which ones work and which don't, for the following DataFrame:
df = pd.DataFrame([[None, 3], ["", np.nan]])
df
#      0    1
#0  None  3.0
#1        NaN
How to check None
Available: .isnull()
>>> df[0].isnull()
0 True
1 False
Name: 0, dtype: bool
Available: .apply with == None or is None
>>> df[0].apply(lambda x: x == None)
0 True
1 False
Name: 0, dtype: bool
>>> df[0].apply(lambda x: x is None)
0 True
1 False
Name: 0, dtype: bool
Available: .values == None
>>> df[0].values == None
array([ True, False])
Unavailable: is or == on the Series itself
>>> df[0] is None
False
>>> df[0] == None
0 False
1 False
Name: 0, dtype: bool
Unavailable: .values is None
>>> df[0].values is None
False
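The contrast above, where the raw .values array matches None but the Series itself does not, comes from pandas routing a Series == None comparison through its missing-value handling. A small sketch contrasting the two, using the same object-dtype data:

```python
import pandas as pd

s = pd.Series([None, ""], dtype=object)

# Comparing the Series to None with == never matches: pandas treats the
# None operand as a missing value, so every cell compares False.
print((s == None).tolist())                   # [False, False]

# An element-wise identity check sees the stored None object directly.
print(s.apply(lambda v: v is None).tolist())  # [True, False]
```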
How to check np.nan
Available: .isnull()
>>> df[1].isnull()
0 False
1 True
Name: 1, dtype: bool
Available: np.isnan
>>> np.isnan(df[1])
0 False
1 True
Name: 1, dtype: bool
>>> np.isnan(df[1].values)
array([False, True])
>>> df[1].apply(lambda x: np.isnan(x))
0 False
1 True
Name: 1, dtype: bool
Unavailable: is or == np.nan
>>> df[1] is np.nan
False
>>> df[1] == np.nan
0 False
1 False
Name: 1, dtype: bool
>>> df[1].values is np.nan
False
>>> df[1].values == np.nan
array([False, False])
>>> df[1].apply(lambda x: x is np.nan)
0 False
1 False
Name: 1, dtype: bool
>>> df[1].apply(lambda x: x == np.nan)
0 False
1 False
Name: 1, dtype: bool
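The == np.nan rows all come out False because IEEE-754 defines NaN as unequal to every value, including itself. That self-inequality is itself a usable NaN test, though .isnull()/.isna() remains the idiomatic one. A short sketch:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan])

print(np.nan == np.nan)    # False: NaN never compares equal, even to itself
print((s != s).tolist())   # [False, True]: self-inequality flags the NaN
print(s.isna().tolist())   # [False, True]: same result, the idiomatic way
```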
Q: How to check for None in a DataFrame / Series
A: isna works but also catches nan. Two suggestions:
- Use x.isna() and replace None with nan
- If you really care about None: x.applymap(type) == type(None)
I prefer comparing types since, for example, nan == nan is False.
In my case the Nones appeared unintentionally, so x[x.isna()] = nan solved the problem.
Example (here nan is np.nan, i.e. from numpy import nan):
x = pd.DataFrame([12, False, 0, np.nan, None]).T
x.isna()
#        0      1      2     3     4
# 0  False  False  False  True  True
x.applymap(type) == type(None)
#        0      1      2      3     4
# 0  False  False  False  False  True
x
#     0      1  2    3     4
# 0  12  False  0  NaN  None
x[x.isna()] = np.nan
x
#     0      1  2    3    4
# 0  12  False  0  NaN  NaN
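Building on the type-comparison idea, the two kinds of missing value can also be tallied separately before normalizing them. A sketch along the same lines, counting how many missing cells are None versus float NaN in the example frame:

```python
import numpy as np
import pandas as pd

x = pd.DataFrame([12, False, 0, np.nan, None]).T  # one object-dtype row

# Identity check finds the literal None cell
is_none = x.applymap(lambda v: v is None)
# Self-inequality finds the float NaN cell (NaN is the only float with v != v)
is_nan = x.applymap(lambda v: isinstance(v, float) and v != v)

print(int(is_none.sum().sum()))  # 1  (the None cell)
print(int(is_nan.sum().sum()))   # 1  (the np.nan cell)
```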