Get Top 3 max values

Question:

I used to have a list and only needed to extract the max values in column 33 every day using below code and then export the data.

df_= pd.read_excel (r'file_location.xlsx')
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')


df_new = (df.groupby(pd.Grouper(key="Date",freq="D"))
            .agg({df.columns[33]: np.max})
            .reset_index())

Now I have a new task to extract the top 3 valus in the same column everyday. I tried below code but doesn’t work.

Any idea?



df_= pd.read_excel (r'file_location.xlsx')
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')


df_new = (df.groupby(pd.Grouper(key="Date",freq="D"))
            .agg({df.columns[33]: np.head(3)})
            .reset_index())



Asked By: PJL

||

Answers:

You need specify column after groupby and call GroupBy.head without agg:

df_e_new = df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]].head(3)
      

Or use SeriesGroupBy.nlargest for top3 sorted values:

df_e_new = df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]].nlargest(3)

For sum top3 values use lambda function:

df_e_new = (df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]]
              .agg(lambda x: x.nlargest(3).sum())
              .reset_index(name='top3sum'))
Answered By: jezrael
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.