Select rows in a pandas DataFrame that satisfy a condition in a column that depends on a subset of rows

Question

Given the following dataframe, I would like to select, for each user, the row with the highest amount:

Name     Amount   ID
--------------------
Alice       100   1
Bob          50   2
Charlie      10   3
Alice        30   4
Charlie      50   5

The result should be:

Name     Amount   ID
--------------------
Alice       100   1
Bob          50   2
Charlie      50   5

How can I do this efficiently?

Asked By: Alberto B

||

Answer 1

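Use groupby.apply to keep, within each group, the rows whose Amount equals the group maximum (if several rows tie for the maximum, all of them are kept):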

df.groupby('Name', as_index=False).apply(lambda x: x.loc[x['Amount'].eq(x['Amount'].max())]).reset_index(drop=True)
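
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the same "Amount equals the group maximum" filter. Note that it swaps apply for a boolean mask built with groupby(...).transform('max'); that is a different technique from the one-liner above, chosen here as an assumption that a vectorized mask is easier to read. The variable names group_max and result are made up for illustration.

```python
import pandas as pd

# Dataframe from the question
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice", "Charlie"],
    "Amount": [100, 50, 10, 30, 50],
    "ID": [1, 2, 3, 4, 5],
})

# Broadcast each group's maximum Amount back onto every row of that group
group_max = df.groupby("Name")["Amount"].transform("max")

# Keep the rows whose Amount equals their group's maximum (ties are all kept)
result = df[df["Amount"].eq(group_max)].reset_index(drop=True)
print(result)
```

Like the apply version, this keeps every row that ties for the maximum within a group.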

Answered By: Space Impact

Answer 2

You can use idxmax:

df.loc[df.groupby('Name')['Amount'].idxmax()]

output:

      Name  Amount  ID
0    Alice     100   1
1      Bob      50   2
4  Charlie      50   5

If you want to reset the index, append reset_index(drop=True):

df.loc[df.groupby('Name')['Amount'].idxmax()].reset_index(drop=True)

output:

      Name  Amount  ID
0    Alice     100   1
1      Bob      50   2
2  Charlie      50   5
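
A self-contained sketch of the above, in case it helps to run it directly. It rebuilds the dataframe from the question, then adds one hypothetical extra row (Bob, 50, ID 6, not part of the original data) to show what happens on ties: idxmax returns the label of the first row holding the maximum, so only one row per name survives. The names idx, tied and tied_result are illustrative. This also assumes the index labels are unique, since df.loc would otherwise return every row sharing a label.

```python
import pandas as pd

# Dataframe from the question
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Alice", "Charlie"],
    "Amount": [100, 50, 10, 30, 50],
    "ID": [1, 2, 3, 4, 5],
})

# For each Name, the index label of the row with the largest Amount
idx = df.groupby("Name")["Amount"].idxmax()
result = df.loc[idx].reset_index(drop=True)
print(result)

# Hypothetical extra row (not in the question) that ties with Bob's maximum
extra = pd.DataFrame({"Name": ["Bob"], "Amount": [50], "ID": [6]})
tied = pd.concat([df, extra], ignore_index=True)

# idxmax picks the first row holding the maximum, so Bob still appears once
tied_result = tied.loc[tied.groupby("Name")["Amount"].idxmax()]
print(tied_result)
```

If every tied row should be kept instead, a filter that compares Amount against the group maximum is the safer choice.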

Answered By: SomeDude
