Filtering a DataFrame to groups where all elements of the group fulfill one of several conditions

Question:

I need to filter a data frame with different groups. The data frame looks as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame({"group":[1,1,1,
                            2,2,2,2,
                            3,3,3,
                            4,4],
                   "percentage":[70,70,70,
                                 45,80,60,70,
                                 71,85,90,
                                 np.nan, np.nan]})

My goal is to return a data frame containing only groups that satisfy one of the two following conditions:

  1. All observations of the group have percentage > 70
  2. All observations of the group are np.nan

I know that I have to group the data frame first and then apply the conditions. A for loop over the groups would do this easily, but such a solution might be very slow. Any help would be appreciated.

Asked By: MF1992


Answers:

You can use groupby with filter, which keeps a group only when the function returns True for it:

df.groupby('group').filter(lambda x: x['percentage'].gt(70).all() | x['percentage'].isna().all())
Out[25]: 
    group  percentage
7       3        71.0
8       3        85.0
9       3        90.0
10      4         NaN
11      4         NaN
Answered By: BENY
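
Since the question raises speed as a concern, here is a sketch of a vectorized variant that avoids calling a Python lambda once per group. It is not part of the answer above; it assumes the same df as in the question and uses groupby(...).transform("all") to broadcast each group's result back onto its rows. The names all_above, all_nan and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"group": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4],
                   "percentage": [70, 70, 70, 45, 80, 60, 70,
                                  71, 85, 90, np.nan, np.nan]})

pct = df["percentage"]

# True on every row of a group whose values are all > 70 (NaN > 70 is False)
all_above = pct.gt(70).groupby(df["group"]).transform("all")

# True on every row of a group whose values are all NaN
all_nan = pct.isna().groupby(df["group"]).transform("all")

# Keep rows belonging to groups that satisfy either condition
result = df[all_above | all_nan]
print(result)
```

Both masks are aligned with the index of df, so they can be combined with | and used for boolean indexing directly. Whether this is actually faster than filter depends on the number and size of the groups, so it is worth timing on the real data.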

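Use:

df[(df['percentage'] > 70) | df['percentage'].isnull()]

Note that this mask is evaluated row by row rather than per group, so row 4 of group 2 is kept even though the other rows of that group are dropped.

Output: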

    group   percentage
4   2         80.0
7   3         71.0
8   3         85.0
9   3         90.0
10  4         NaN
11  4         NaN
Answered By: Divyank
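
To see which groups a row-wise mask like the one above keeps only in part, one option is to compare per-group row counts before and after filtering. This is a minimal sketch assuming the df from the question; row_mask, sizes and partial are hypothetical names used only here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"group": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4],
                   "percentage": [70, 70, 70, 45, 80, 60, 70,
                                  71, 85, 90, np.nan, np.nan]})

# Row-wise mask: each row is judged on its own value
row_mask = df["percentage"].gt(70) | df["percentage"].isna()
row_wise = df[row_mask]

# Rows per group before and after the row-wise filter
sizes = pd.DataFrame({
    "original": df.groupby("group").size(),
    "kept": row_wise.groupby("group").size(),
}).fillna(0).astype(int)

# Groups that lost some, but not all, of their rows
partial = sizes[(sizes["kept"] > 0) & (sizes["kept"] < sizes["original"])]
print(sizes)
print(partial.index.tolist())
```

Any group listed in partial has been split by the row-wise filter, which the conditions in the question do not allow: a group should be returned whole or not at all.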