Using Lambda in GroupBy .agg to find Mean

Question:

I’m not understanding how to pull the Mean of filtered data within a groupby DataFrame. It works perfectly for sum(), but mean() simply gives me the Percentage of occurrences from the total count.

df = test.groupby(['Code']).agg(
  Count=('% Change', 'count'),
  H2C_Up_Mean=('% C2H',lambda x: (x > 0).mean()),
  H2C_Pct_Up=('% C2H',lambda x: (x > 0).sum())

"H2C_Pct_Up" properly sums when I manually tally in Excel.
"H2C_Up_Mean" gives me "H2C_Pct_Up" DIVIDED by "Count"

How can I pull the ‘Mean’ of when "x > 0" from within a groupby dataframe?

Example Dataframe:

code   % change    % C2H
abc    0.50        0.75
abc    1.00        0.25
abc    -0.25       -0.50

Expected output…

  • Count : 3 (CORRECT)
  • H2C_Pct_Up : 2 (CORRECT)
  • H2C_Up_Mean : 0.50 (ERROR; This produces 0.667)
Asked By: jrcart

||

Answers:

df = test.groupby(['Code']).agg(
  Count=('% Change', 'count'),
  H2C_Up_Mean=('% C2H',lambda x: x[x > 0].mean()),
  H2C_Pct_Up=('% C2H',lambda x: (x > 0).sum()))

gives df as:

      Count  H2C_Up_Mean  H2C_Pct_Up
Code                                
abc       3          0.5           2
Answered By: Frodnar
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.