How to filter in NaN (pandas)?

Question:

I have a pandas dataframe (df), and I want to do something like:

newdf = df[(df.var1 == 'a') & (df.var2 == NaN)]

I’ve tried replacing NaN with np.NaN, or 'NaN' or 'nan' etc, but nothing evaluates to True. There’s no pd.NaN.

I can use df.fillna(np.nan) before evaluating the above expression but that feels hackish and I wonder if it will interfere with other pandas operations that rely on being able to identify pandas-format NaN’s later.

I get the feeling there should be an easy answer to this question, but somehow it has eluded me. Any advice is appreciated. Thank you.

Asked By: Gerhard


Answers:

This doesn’t work because NaN isn’t equal to anything, including NaN. Use pd.isnull(df.var2) instead.
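
As a rough illustration of the point above, the sketch below uses a small made-up DataFrame (the data is an assumption; only the column names var1 and var2 come from the question). It shows that comparing with NaN matches nothing, while pd.isnull produces a usable mask:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "var1": ["a", "a", "b", "b"],
    "var2": [1.0, np.nan, np.nan, 4.0],
})

# NaN never compares equal, not even to itself.
print(np.nan == np.nan)  # False

# So an equality test against NaN matches no rows at all.
print((df.var2 == np.nan).any())  # False

# pd.isnull flags the missing values, and its mask combines with other conditions.
newdf = df[(df.var1 == "a") & pd.isnull(df.var2)]
print(newdf)
```

With this sample data, newdf should hold the single row where var1 is 'a' and var2 is missing.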

Answered By: Mark Whitfield

Pandas uses numpy's NaN value. Use numpy.isnan to obtain a Boolean vector from a pandas series.
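
A minimal sketch of this approach, assuming a numeric (float) column; the sample Series are made up for illustration. One caveat worth knowing: numpy.isnan is only defined for numeric data, so on an object-dtype column it should raise a TypeError, whereas pd.isnull handles both cases:

```python
import numpy as np
import pandas as pd

# Hypothetical float column with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

# np.isnan applied to a Series gives back a Boolean Series.
mask = np.isnan(s)
print(mask.tolist())  # [False, True, False]
print(s[mask])        # only the NaN entry

# On object dtype (e.g. strings mixed with NaN), np.isnan is not supported.
obj = pd.Series(["x", np.nan], dtype=object)
try:
    np.isnan(obj)
except TypeError:
    print("np.isnan does not support object dtype")

# pd.isnull works regardless of dtype.
print(pd.isnull(obj).tolist())  # [False, True]
```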

Answered By: NicholasM

Simplest of all solutions:

filtered_df = df[df['var2'].isnull()]

This keeps only the rows that have NaN in the 'var2' column.
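
For illustration, here is the same idea on a made-up DataFrame (the data is an assumption), together with the complementary notnull() mask and the two-condition filter from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "var1": ["a", "a", "b"],
    "var2": [np.nan, 2.0, np.nan],
})

# Rows where var2 is NaN.
missing = df[df["var2"].isnull()]

# The complement: rows where var2 has a value.
present = df[df["var2"].notnull()]

# Combined with another condition, as in the original question.
both = df[(df["var1"] == "a") & df["var2"].isnull()]

print(missing)
print(present)
print(both)
```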

Answered By: Gil Baggio

df[df['var'].isna()]

where

df  : The DataFrame
var : The Column Name

Answered By: Mohammad Shalaby

You can also use query here:

df.query('var2 != var2')

This works since np.nan != np.nan.
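
A small sketch of the query approach on made-up data (the DataFrame contents are an assumption). The self-comparison trick can also be joined with other conditions inside the same query string:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "var1": ["a", "a", "b"],
    "var2": [np.nan, 2.0, np.nan],
})

# A value differs from itself only when it is NaN.
nan_rows = df.query("var2 != var2")

# The same test combined with a second condition.
combined = df.query("var1 == 'a' and var2 != var2")

print(nan_rows)
print(combined)
```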

Answered By: rachwa