get max value after grouping two columns pandas

Question

I want to get the trip_time with the highest count for each age_group

age_group	trip_time	counts
18 – 30yrs	01am	23
18 – 30yrs	02am	2
18 – 30yrs	03am	213
31 – 50yrs	01am	74
31 – 50yrs	02am	211
31 – 50yrs	03am	852
51 – 70yrs	01am	23
51 – 70yrs	02am	11
51 – 70yrs	03am	101

Expected output:

age_group	trip_time	counts
18 – 30yrs	03am	213
31 – 50yrs	03am	852
51 – 70yrs	03am	101

trip_time_age_group.groupby(['age_group', 'trip_time'])['counts'].max()

But it gives me wrong result

Asked By: Khola

||

Source

Answer 1

groupby ‘age-group’ and then use transform to get the max count for each age-group, which you then compare with the count of the DF, to get the resultset

df.loc[df.groupby('age_group')['counts'].transform('max').eq(df['counts'])]

    age_group trip_time  counts
2  18 - 30yrs      03am     213
5  31 - 50yrs      03am     852
8  51 - 70yrs      03am     101

Answered By: Naveed

Answer 2

Groupby only age_group, then find the rows in each group with the max counts.

df.groupby('age_group').apply(
    lambda sf: sf.loc[sf['counts'] == sf['counts'].max()]
).reset_index(drop=True)

    age_group trip_time  counts
0  18 - 30yrs      03am     213
1  31 - 50yrs      03am     852
2  51 - 70yrs      03am     101

You could also do sf.query('counts == counts.max()') instead of sf.loc[...].

Answered By: wjandrea

get max value after grouping two columns pandas

Question:

Answers: