Apply groupby only to rows where a condition is met

Question:

How do you apply a groupby in a dataframe only when a condition is met in a row and still keep all other rows that do not meet said condition?

For instance, let’s say I have a dataframe:

Number  Version  Binary  Random column
1       0        True    a
1       1        True    b
1       2        True    c
2       0        False   a
2       1        False   b
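
For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is a minimal sketch; the variable name df is an assumption, chosen because the answers below use it.

```python
import pandas as pd

# Sample data from the question; the variable name `df` is an assumption.
df = pd.DataFrame({
    "Number": [1, 1, 1, 2, 2],
    "Version": [0, 1, 2, 0, 1],
    "Binary": [True, True, True, False, False],
    "Random column": ["a", "b", "c", "a", "b"],
})
print(df)
```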

With this dataframe, I would like to group by Number wherever Binary is True and keep only the row with the latest version (that is, the largest Version value), while keeping, unchanged, the rows that do not meet the condition.

The result should look like this:

Number  Version  Binary  Random column
1       2        True    c
2       0        False   a
2       1        False   b
Asked By: AvocadoToast

Answers:

Here is a way using pd.concat() and idxmax():

(pd.concat([df.loc[df.loc[df['Binary']].groupby('Number')['Version'].idxmax()],
            df.loc[~df['Binary']]],
           ignore_index=True))

Output:

   Number  Version  Binary Random column
0       1        2    True             c
1       2        0   False             a
2       2        1   False             b
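
The same result can also be reached with sort_values() and drop_duplicates() instead of idxmax(). The following is a sketch, not part of the answer as posted; it assumes df holds the question's sample data, and sort_index() is added only to restore the original row order.

```python
import pandas as pd

# Assumed sample data, copied from the question.
df = pd.DataFrame({
    "Number": [1, 1, 1, 2, 2],
    "Version": [0, 1, 2, 0, 1],
    "Binary": [True, True, True, False, False],
    "Random column": ["a", "b", "c", "a", "b"],
})

mask = df["Binary"]

# Among the rows where Binary is True, keep the highest Version per Number.
latest_true = (
    df[mask]
    .sort_values("Version")
    .drop_duplicates("Number", keep="last")
)

# Add back the untouched rows and restore the original row order.
result = pd.concat([latest_true, df[~mask]]).sort_index()
print(result)
```

Unlike ignore_index=True, this keeps the original index labels, which can help when the rows need to be traced back to the source frame.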
Answered By: rhug123

I would use boolean indexing here:

# is the row the max version of its (Number, Binary) group?
m1 = df['Version'].eq(df.groupby(['Number', 'Binary'])['Version'].transform('max'))
# is the binary not True?
m2 = ~df['Binary']

# keep if either condition above is met
df[m1|m2]

Output:

   Number  Version  Binary Random column
2       1        2    True             c
3       2        0   False             a
4       2        1   False             b
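
To check how the masks behave when several Numbers have Binary set to True, here is a sketch on a slightly larger frame. The rows for Number 3 are made up for illustration and are not part of the question's data. Grouping on both Number and Binary is an assumption that keeps the comparison inside the True rows of each Number.

```python
import pandas as pd

# Question's sample data plus two invented rows (Number 3) for illustration.
df2 = pd.DataFrame({
    "Number": [1, 1, 1, 2, 2, 3, 3],
    "Version": [0, 1, 2, 0, 1, 0, 1],
    "Binary": [True, True, True, False, False, True, True],
    "Random column": ["a", "b", "c", "a", "b", "d", "e"],
})

# Highest Version within each (Number, Binary) group, aligned to the rows.
group_max = df2.groupby(["Number", "Binary"])["Version"].transform("max")

m1 = df2["Version"].eq(group_max)  # row holds the latest version of its group
m2 = ~df2["Binary"]                # row does not meet the condition

result = df2[m1 | m2]
print(result)
```

Each Number with Binary set to True contributes exactly one row (its latest version), while both rows of Number 2 pass through untouched.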

Alternative:

If the per-group operation is expensive, you could also apply it only to the groups that match the condition and return the other groups unchanged:

(df.groupby(['Number', 'Binary'], group_keys=False, sort=False)
   .apply(lambda g: g.loc[g['Version'].eq(g['Version'].max())]
          if g.name[1] else g)  # g.name is the (Number, Binary) key
 )
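
If the per-group step really is expensive, another option is to run it only on the rows that meet the condition and concatenate the rest back unchanged. This is a sketch, not the answer's own code: keep_latest is a hypothetical stand-in for the expensive operation, and df is assumed to hold the question's sample data. Iterating over the groups avoids relying on how apply() treats the grouping columns, which has changed across pandas versions.

```python
import pandas as pd

# Assumed sample data, copied from the question.
df = pd.DataFrame({
    "Number": [1, 1, 1, 2, 2],
    "Version": [0, 1, 2, 0, 1],
    "Binary": [True, True, True, False, False],
    "Random column": ["a", "b", "c", "a", "b"],
})


def keep_latest(group: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for an expensive per-group operation."""
    return group.loc[group["Version"].eq(group["Version"].max())]


mask = df["Binary"]

# Run the operation only on groups drawn from the rows where Binary is True.
processed = [keep_latest(g) for _, g in df[mask].groupby("Number", sort=False)]

# Rows that do not meet the condition are passed through unchanged.
result = pd.concat(processed + [df[~mask]]).sort_index()
print(result)
```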
Answered By: mozway