Merging a pandas groupby result back into DataFrame

Question:

I have a DataFrame that looks like this…

   idn value  
0  ID1    25
1  ID1    30
2  ID2    30
3  ID2    50

I want to add another column to this frame that is the max ‘value’ grouped by ‘idn’

I want a result that looks like this.

   idn value  max_val
0  ID1    25       30
1  ID1    30       30
2  ID2    30       50
3  ID2    50       50

I can extract the max of ‘value’ using a group by like so…

df[['idn', 'value']].groupby('idn')['value'].max()

However, I am unable to merge that result back into the original DataFrame.

What is the best way to get the desired result?

Thank You

Asked By: Rob Kulseth

||

Answers:

set the index of df to idn, and then use df.merge. after the merge, reset the index and rename columns

dfmax = df.groupby('idn')['value'].max()

df.set_index('idn', inplace=True)

df = df.merge(dfmax, how='outer', left_index=True, right_index=True)

df.reset_index(inplace=True)

df.columns = ['idn', 'value', 'max_value']
Answered By: Haleemur Ali

Use the transform method on a groupby object:

In [5]: df['maxval'] = df.groupby(by=['idn']).transform('max')

In [6]: df
Out[6]: 
   idn  value  maxval
0  ID1     25      30
1  ID1     30      30
2  ID2     30      50
3  ID2     50      50
Answered By: Paul H

For anyone else that is interested. The one-liner method for this is to use reset_index.

as such:

df['value_max'] = df[['idn', 'value']].groupby('idn')['value'].max().reset_index(['idn', 'value'], drop = True)
Answered By: ur_mom
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.