Pandas Keep top N values each group and set others to 0

Question:

I have the following df, which is adapted from here

import pandas as pd    
df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})

I would like to have the following result:

df1 = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[0,2,3,0,2,3,4,3,0]})

The logic is (1) to select top 2 numbers for each group and (2) to set other values to 0.

Any suggestions? Thanks.

Asked By: kevin

||

Answers:

Use nlargest

df.loc[~df.index.isin(df.groupby('group')['value']
                        .apply(lambda s: s.nlargest(2))
                        .index
                        .get_level_values(1)), 'value'] = 0

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0
Answered By: rafaelc

You can use a combination of where() and cumcount() to get the results you are expecting

df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})
df.where(df.sort_values('value', ascending = False).groupby('group')['value'].cumcount() + 1 <= 2, 0)
Answered By: ArchAngelPwn

Using nlargest per group for boolean indexing:

m = df.groupby('group')['value'].apply(lambda g: g.isin(g.nlargest(2)))

df['value'] = df['value'].where(m, 0)

Output (as new column "value2" for clarity)

   group  value  value2
0      1      1       0
1      1      2       2
2      1      3       3
3      2      1       0
4      2      2       2
5      2      3       3
6      3      4       4
7      3      3       3
8      3      2       0
Answered By: mozway
def function1(dd:pd.DataFrame):
    return dd.assign(value=np.where(dd.index.isin(dd.nlargest(2,'value').index),dd.value,0))

df.groupby('group').apply(function1)

out

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0
Answered By: G.G
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.