How do I select which rows to drop from a Pandas DataFrame based on two conditions?

Question:

I need to remove some rows from a DataFrame like this:

import pandas as pd
import numpy as np

input_ = pd.DataFrame()
input_['ID'] = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
input_['ST'] = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
input_['V'] = [np.nan, np.nan, 1, 1, np.nan, 1, np.nan, 1, np.nan, np.nan, np.nan, np.nan]

and end up with a DataFrame like this one:

output_ = pd.DataFrame()
output_['ID'] = [2, 3, 4, 2, 3, 4, 2, 3, 4]
output_['ST'] = [1, 1, 1, 2, 2, 2, 3, 3, 3]
output_['V'] = [np.nan, 1, 1, 1, np.nan, 1, np.nan, np.nan, np.nan]

Here I removed the rows with ID == 1 because, for that ID, column V is NaN (np.isnan(V)) for ALL values of column ST. How should I select which rows to drop from a Pandas DataFrame under these two conditions?

Answers:

Use groupby().transform('any') to check whether each ID group contains at least one non-NaN value:

valids = input_.V.notna().groupby(input_.ID).transform('any')

output = input_[valids]

Output:

    ID  ST    V
1    2   1  NaN
2    3   1  1.0
3    4   1  1.0
5    2   2  1.0
6    3   2  NaN
7    4   2  1.0
9    2   3  NaN
10   3   3  NaN
11   4   3  NaN
Answered By: Quang Hoang
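
For readers who want to see the intermediate steps of the groupby approach above, here is a minimal, self-contained sketch. The names has_value, keep and alt are made up for illustration, and the GroupBy.filter variant is only an alternative formulation, not part of the original answer.

```python
import numpy as np
import pandas as pd

input_ = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "ST": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "V": [np.nan, np.nan, 1, 1, np.nan, 1, np.nan, 1,
          np.nan, np.nan, np.nan, np.nan],
})

# Step 1: True wherever V holds a real (non-NaN) value.
has_value = input_["V"].notna()

# Step 2: for each ID, ask "is there at least one real value?" and
# broadcast that answer back to every row of the group.
keep = has_value.groupby(input_["ID"]).transform("any")

# Step 3: boolean indexing keeps only rows whose ID has some real value.
output = input_[keep]

# Alternative formulation: keep whole groups that pass a predicate.
alt = input_.groupby("ID").filter(lambda g: g["V"].notna().any())

print(output)
```

Both forms keep the original row order and index labels. The transform version returns a boolean mask, which is handy if you want to combine it with other conditions.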

Try this:

input_ = input_[(input_['ID']!=1) & input_['V'].notnull()]

I’m not sure I fully understand your question and whether you wanted to filter by 1, or if you only did that to get rid of the NaN values. If you don’t want to actually filter by ID==1, just do:

input_ = input_[input_['V'].notnull()]

Output for both:

   ID  ST    V
2   3   1  1.0
3   4   1  1.0
5   2   2  1.0
7   4   2  1.0
Answered By: David Moreau
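
As a small sketch of what the row-wise filter above does, assuming the same sample data as in the question: the notnull() mask should be equivalent to DataFrame.dropna with subset=['V']. The names by_mask and by_dropna are just illustrative. Note that this removes every row where V is NaN, regardless of ID.

```python
import numpy as np
import pandas as pd

input_ = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "ST": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "V": [np.nan, np.nan, 1, 1, np.nan, 1, np.nan, 1,
          np.nan, np.nan, np.nan, np.nan],
})

# Row-wise filter: keep only rows where V itself is not NaN.
by_mask = input_[input_["V"].notnull()]

# Same idea expressed with dropna, restricted to column V.
by_dropna = input_.dropna(subset=["V"])

print(by_dropna)
```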

import numpy as np
import pandas as pd

input_ = pd.DataFrame()
input_['ID'] = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
input_['ST'] = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
# Use np.nan rather than the string 'NaN' so pandas treats the values as missing
input_['V'] = [np.nan, np.nan, 1, 1, np.nan, 1, np.nan, 1, np.nan, np.nan, np.nan, np.nan]

# Work on a copy so that input_ itself is left unchanged
input_1 = input_.copy()
print(input_1)

# Drop the three rows with ID == 1 by their index labels
input_1.drop([0, 4, 8], inplace=True)
print(input_1)
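
If you would rather not hard-code the labels 0, 4 and 8, one possible sketch is to compute them from the data. This assumes the same sample frame as in the question; all_nan, ids_to_drop and rows_to_drop are illustrative names, not anything from the answers above.

```python
import numpy as np
import pandas as pd

input_ = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "ST": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "V": [np.nan, np.nan, 1, 1, np.nan, 1, np.nan, 1,
          np.nan, np.nan, np.nan, np.nan],
})

# One boolean per ID: True if V is NaN in every row of that ID.
all_nan = input_["V"].isna().groupby(input_["ID"]).all()

# The IDs whose V values are all missing.
ids_to_drop = all_nan[all_nan].index

# Index labels of the rows that belong to those IDs.
rows_to_drop = input_.index[input_["ID"].isin(ids_to_drop)]

# drop returns a new DataFrame; input_ is left untouched.
output = input_.drop(index=rows_to_drop)
print(output)
```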