How to remove rows with null values from the kth column onward in Python

Question

I need to remove every row in which all elements from column 3 onwards are NaN:

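import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(6, 5), index=['a', 'c', 'e', 'f', 'g', 'h'], columns=['one', 'two', 'three', 'four', 'five'])

# Reindexing adds rows 'b' and 'd', filled with NaN
df2 = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
# Positional assignment; .ix has been removed from current pandas
df2.iloc[1, 0] = 111
df2.iloc[1, 1] = 222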

In the example above, my final data frame should not contain rows 'b' and 'd'.

How can I use df.dropna() in this case?

Asked By: user1140126

Answer 1

You can call dropna with arguments subset and how:

df2.dropna(subset=['three', 'four', 'five'], how='all')

As the names suggest:

how='all' requires every column (of subset) in the row to be NaN for the row to be dropped, as opposed to the default 'any', which drops the row if any of them is NaN.
subset lists the columns to inspect for NaNs.
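To see the two arguments working together, here is a minimal sketch that rebuilds the frame from the question and runs the call. The seeded generator and the extra NaN placed in row 'e' are my own additions for illustration (they are not in the question); the extra NaN is only there so that 'all' and 'any' give different results.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is reproducible (assumption, not in the question)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 5)),
                  index=['a', 'c', 'e', 'f', 'g', 'h'],
                  columns=['one', 'two', 'three', 'four', 'five'])

# Rows 'b' and 'd' are new, so they start out as all NaN
df2 = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
df2.iloc[1, 0] = 111
df2.iloc[1, 1] = 222

# Illustration only: blank one cell so row 'e' is partially missing
df2.loc['e', 'five'] = np.nan

cols = ['three', 'four', 'five']
all_missing = df2.dropna(subset=cols, how='all')  # drop only if all three are NaN
any_missing = df2.dropna(subset=cols, how='any')  # drop if at least one is NaN

print(list(all_missing.index))  # ['a', 'c', 'e', 'f', 'g', 'h']
print(list(any_missing.index))  # ['a', 'c', 'f', 'g', 'h']
```

Row 'b' is dropped even though 'one' and 'two' hold values, because those columns are not in subset. Note that dropna returns a new frame, so assign the result if you want to keep it.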

As @PaulH points out, we can generalise to inspect every column from position k onward (zero-based, so k=2 for "column 3 onwards") with:

subset=df2.columns[k:]
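A rough sketch of how that slice could be wrapped up for reuse. The helper name drop_rows_empty_from and the small hand-written frame are hypothetical, made up for this example. The boolean mask at the end is just another way to express the same filter, included as a sanity check.

```python
import numpy as np
import pandas as pd

def drop_rows_empty_from(frame, k):
    """Drop rows that are all NaN in the columns at positions k, k+1, ... (zero-based)."""
    return frame.dropna(subset=frame.columns[k:], how='all')

# Small deterministic frame (hypothetical data)
frame = pd.DataFrame(
    {'one':   [1.0, 111.0, np.nan],
     'two':   [2.0, 222.0, np.nan],
     'three': [3.0, np.nan, np.nan],
     'four':  [4.0, np.nan, np.nan]},
    index=['a', 'b', 'd'])

kept = drop_rows_empty_from(frame, 2)
print(list(kept.index))  # ['a']

# Equivalent formulation: keep rows with at least one non-NaN value from column k on
mask = frame.iloc[:, 2:].notna().any(axis=1)
print(frame[mask].equals(kept))  # True
```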

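Indeed, we could even do something more complicated if desired, such as selecting columns by a condition on their names (wrapped in list() because filter returns an iterator in Python 3):

subset=list(filter(lambda x: len(x) > 3, df2.columns))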

Answered By: Andy Hayden
