Index of non-NaN values in pandas
Question:
How do I get the index of the non-NaN values in a pandas DataFrame?
My data frame is
A b c
0 1 q1 1
1 2 NaN 3
2 3 q2 3
3 4 q1 NaN
4 5 q2 7
I want the index of the rows in which column b is not NaN. (There can be NaN values in other columns, e.g. c.)
non_nan_index = [0,2,3,4]
Using this list of non-NaN index labels, I want to create a new data frame in which column b has no NaN values:
df2=
A b c
0 1 q1 1
1 3 q2 3
2 4 q1 NaN
3 5 q2 7
Answers:
Just filter the rows with a boolean mask:
In [62]:
df['b'].notnull()
Out[62]:
0 True
1 False
2 True
3 True
4 True
Name: b, dtype: bool
In [63]:
df[df['b'].notnull()]
Out[63]:
A b c
0 1 q1 1
2 3 q2 3
3 4 q1 NaN
4 5 q2 7
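The filter above returns the rows but not the list of index labels the question asks for. A minimal sketch of one way to get both, assuming the frame from the question is rebuilt as below (the names mask, non_nan_index and df2 are only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "b": ["q1", np.nan, "q2", "q1", "q2"],
    "c": [1, 3, 3, np.nan, 7],
})

# Boolean mask: True where column b holds a value
mask = df["b"].notnull()

# Index labels of the rows where b is not NaN
non_nan_index = df.index[mask].tolist()

# New frame built from those labels, renumbered from 0 as in the question
df2 = df.loc[non_nan_index].reset_index(drop=True)
print(non_nan_index)
print(df2)
```

Skip reset_index(drop=True) if the original row labels should be preserved.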
DataFrames have a dropna method:
import pandas
import numpy
d = pandas.DataFrame({'A': [1, 2, 3, numpy.nan],
                      'b': [1, 2, numpy.nan, 3],
                      'c': [1, numpy.nan, 2, 3]})
# Drop only the rows where column 'b' is NaN; NaN in other columns is kept
d.dropna(subset=['b'])
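The example above uses different data from the question. As a sketch, here is the same dropna call on the question's frame (rebuilt here as an assumption), next to a plain dropna() to show why subset matters:

```python
import numpy as np
import pandas as pd

# Frame from the question
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "b": ["q1", np.nan, "q2", "q1", "q2"],
    "c": [1, 3, 3, np.nan, 7],
})

# Only column b is checked, so row 3 (NaN in c) is kept
kept = df.dropna(subset=["b"])

# Without subset, any NaN in any column drops the row
strict = df.dropna()

print(kept.index.tolist())
print(strict.index.tolist())
```

dropna returns a new frame and keeps the original index labels, so kept.index is itself the list of non-NaN rows for column b.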
You can also use query here:
In [5]: df.query('b == b')
Out[5]:
A b c
0 1 q1 1.0
2 3 q2 3.0
3 4 q1 NaN
4 5 q2 7.0
This works because NaN compared to itself returns False, so b == b is True only for the rows where b is not NaN:
In [5]: np.nan == np.nan
Out[5]: False
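To see that the self-comparison trick selects the same rows as notnull, here is a small sketch on the question's frame (rebuilt here as an assumption). Passing engine="python" is my own conservative choice so the example does not depend on numexpr being installed; the original answer calls query without it.

```python
import numpy as np
import pandas as pd

# Frame from the question
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "b": ["q1", np.nan, "q2", "q1", "q2"],
    "c": [1, 3, 3, np.nan, 7],
})

# NaN is never equal to itself
nan_equals_itself = (np.nan == np.nan)

# Element-wise self-comparison is False exactly where b is NaN
self_eq = df["b"] == df["b"]

# Same filter expressed through query
result = df.query("b == b", engine="python")

print(nan_equals_itself)
print(self_eq.tolist())
print(result.index.tolist())
```

The notnull or dropna forms state the intent more directly; the query form is mainly handy when the condition is already part of a longer query string.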