Pandas Groupby get max of multiple columns but in order

Question:

I have below pandas dataframe

COlA    ColB    Result  Freq
A       B       1       3000
A       C       0.2     4000
A       D       1       5000
A       E       0.3     9000
A       F       0.4     8000
B       A       0.4     1000
B       C       0.1     4000
B       D       0.1     5000
B       E       0.2     9000
B       F       0.3     8000
...

I want to groupby ColA and get max of Result and Typesin order i.e. in such a way that it should first find the max of Result column if there are more than one max row then look at Freq and find the max there. I’ve tried using groupby().max().reset_index() but not getting the desired output

Expected Output

COlA    ColB    Result  Freq
A       D       1       5000
B       A       0.4     1000
...
Asked By: Ash

||

Answers:

You can sort by Results/Freq and then groupby+first:

(df.sort_values(by=['Result', 'Freq'], ascending=False)
   .groupby(['COlA'], as_index=False).first()
)

output:

  COlA ColB  Result  Freq
0    A    D     1.0  5000
1    B    A     0.4  1000

NB. warning your column name is COlA (with a capital O)

Answered By: mozway
def function1(dd:pd.DataFrame):
    return dd.sort_values(by=['Result','Freq'],ascending=[False,False]).head(1)

df1.groupby('COlA').apply(function1).reset_index(drop=True)


out

  COlA ColB  Result  Freq
0    A    D     1.0  5000
1    B    A     0.4  1000
Answered By: G.G
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.