Select rows in a pandas DataFrame that satisfy a condition in a column that depends on a subset of rows

Question:

Given the following DataFrame, I would like to select, for each user, the row that has the highest amount:

Name     Amount   ID
--------------------
Alice       100   1
Bob          50   2
Charlie      10   3
Alice        30   4
Charlie      50   5

The result should be:

Name     Amount   ID
--------------------
Alice       100   1
Bob          50   2
Charlie      50   5

How can I do this efficiently?

Asked By: Alberto B


Answers:

Use groupby.apply to keep, within each group, the rows whose Amount equals the group maximum:

df.groupby('Name', as_index=False).apply(lambda x: x.loc[x['Amount'].eq(x['Amount'].max())]).reset_index(drop=True)
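
As a possible alternative to apply, the same "equals the group maximum" filter can be sketched with groupby.transform, which avoids calling a Python lambda once per group. This is a minimal sketch and not part of the original answer: the DataFrame is rebuilt from the question's sample data, and the names group_max and result are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice", "Charlie"],
    "Amount": [100, 50, 10, 30, 50],
    "ID": [1, 2, 3, 4, 5],
})

# Broadcast each group's maximum Amount back onto every row of that group.
group_max = df.groupby("Name")["Amount"].transform("max")

# Keep the rows whose Amount equals their group's maximum.
# If several rows tie for the maximum, all of them are kept.
result = df[df["Amount"].eq(group_max)].reset_index(drop=True)
print(result)
```

Like the apply version, this keeps every row that ties for the maximum within a group. Unlike it, the rows stay in their original order rather than being grouped by Name.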
Answered By: Space Impact

You can use idxmax, which returns the index label of the row with the maximum Amount in each group:

df.loc[df.groupby('Name')['Amount'].idxmax()]

output:

      Name  Amount  ID
0    Alice     100   1
1      Bob      50   2
4  Charlie      50   5

If you want to reset the index, append reset_index(drop=True):

df.loc[df.groupby('Name')['Amount'].idxmax()].reset_index(drop=True)

output:

      Name  Amount  ID
0    Alice     100   1
1      Bob      50   2
2  Charlie      50   5
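
For reference, here is a self-contained sketch of the idxmax approach on the question's data. It also shows what happens when two rows tie for the maximum. The extra Bob row (ID 6) and the names best, tied and first_only are assumptions added for illustration and are not in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice", "Charlie"],
    "Amount": [100, 50, 10, 30, 50],
    "ID": [1, 2, 3, 4, 5],
})

# idxmax gives one index label per group; loc then selects those rows.
best = df.loc[df.groupby("Name")["Amount"].idxmax()]

# Hypothetical tie: a second Bob row with the same Amount (not in the question).
extra = pd.DataFrame({"Name": ["Bob"], "Amount": [50], "ID": [6]})
tied = pd.concat([df, extra], ignore_index=True)

# idxmax returns only the first occurrence of the maximum,
# so exactly one row per Name survives even when there is a tie.
first_only = tied.loc[tied.groupby("Name")["Amount"].idxmax()]
print(first_only)
```

This selects exactly one row per Name, so it fits when ties should be collapsed to the first occurrence. It also assumes the index labels are unique, since loc would return every row sharing a duplicated label.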
Answered By: SomeDude