Pandas groupby and calculate percentage change

Question:

I am following the answer to "How to create rolling percentage for groupby DataFrame":

import pandas as pd

data = [
    ('product_a','1/31/2014',53)
    ,('product_b','1/31/2014',44)
    ,('product_c','1/31/2014',36)
    ,('product_a','11/30/2013',52)
    ,('product_b','11/30/2013',43)
    ,('product_c','11/30/2013',35)
    ,('product_a','3/31/2014',50)
    ,('product_b','3/31/2014',41)
    ,('product_c','3/31/2014',34)
    ,('product_a','12/31/2013',50)
    ,('product_b','12/31/2013',41)
    ,('product_c','12/31/2013',34)
    ,('product_a','2/28/2014',52)
    ,('product_b','2/28/2014',43)
    ,('product_c','2/28/2014',35)]

product_df = pd.DataFrame( data, columns=['prod_desc','activity_month','prod_count'] )
product_df.sort_values('activity_month', inplace = True, ascending=False) 
product_df['pct_ch'] = product_df.groupby('prod_desc')['prod_count'].pct_change() + 1

print(product_df)

However, I am not able to reproduce the output shown in the suggested answer.

This is the output I get:

    prod_desc activity_month  prod_count    pct_ch
0   product_a      1/31/2014          53       NaN
1   product_b      1/31/2014          44  0.830189
2   product_c      1/31/2014          36  0.818182
3   product_a     11/30/2013          52  1.444444
4   product_b     11/30/2013          43  0.826923
5   product_c     11/30/2013          35  0.813953
9   product_a     12/31/2013          50  1.428571
10  product_b     12/31/2013          41  0.820000
11  product_c     12/31/2013          34  0.829268
12  product_a      2/28/2014          52  1.529412
13  product_b      2/28/2014          43  0.826923
14  product_c      2/28/2014          35  0.813953
6   product_a      3/31/2014          50  1.428571
7   product_b      3/31/2014          41  0.820000
8   product_c      3/31/2014          34  0.829268

The expected output should look similar to the example below: the percentage change should be calculated separately within each prod_desc (product_a, product_b and product_c) rather than down the whole column.

 product_desc activity_month  prod_count    pct_ch
0    product_a     2014-01-01          53       NaN
3    product_a     2014-02-01          26  0.490566
6    product_a     2014-03-01          41  1.576923
1    product_b     2014-01-01          42       NaN
4    product_b     2014-02-01          48  1.142857
7    product_b     2014-03-01          35  0.729167
2    product_c     2014-01-01          38       NaN
5    product_c     2014-02-01          39  1.026316
8    product_c     2014-03-01          50  1.282051

Thank you in advance

Asked By: Platalea Minor

||

Answers:

Convert activity_month to datetimes, sort by prod_desc and activity_month, then use GroupBy.apply with Series.pct_change:

product_df['activity_month'] = pd.to_datetime(product_df['activity_month'])
product_df.sort_values(['prod_desc','activity_month'], inplace = True, ascending=[True, False])

product_df['pct_ch'] = (product_df.groupby('prod_desc')['prod_count']
                                  .apply(pd.Series.pct_change) + 1)
print(product_df)

    prod_desc activity_month  prod_count    pct_ch
6   product_a     2014-03-31          50       NaN
12  product_a     2014-02-28          52  1.040000
0   product_a     2014-01-31          53  1.019231
9   product_a     2013-12-31          50  0.943396
3   product_a     2013-11-30          52  1.040000
7   product_b     2014-03-31          41       NaN
13  product_b     2014-02-28          43  1.048780
1   product_b     2014-01-31          44  1.023256
10  product_b     2013-12-31          41  0.931818
4   product_b     2013-11-30          43  1.048780
8   product_c     2014-03-31          34       NaN
14  product_c     2014-02-28          35  1.029412
2   product_c     2014-01-31          36  1.028571
11  product_c     2013-12-31          34  0.944444
5   product_c     2013-11-30          35  1.029412
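
As a hedged alternative to apply: on some newer pandas releases, I believe GroupBy.apply can return a result whose index also carries the group key, which may not align when assigned back as a column. GroupBy.transform keeps the original row index, so a minimal sketch under that assumption looks like this. The small frame `df` is illustrative only, built from a few of the question's values.

```python
import pandas as pd

# Illustrative subset of the question's data (two products, two months).
df = pd.DataFrame({
    "prod_desc": ["product_a", "product_b", "product_a", "product_b"],
    "activity_month": ["1/31/2014", "1/31/2014", "11/30/2013", "11/30/2013"],
    "prod_count": [53, 44, 52, 43],
})

# Parse the month strings so sorting is chronological, not lexicographic.
df["activity_month"] = pd.to_datetime(df["activity_month"], format="%m/%d/%Y")

# Same ordering as the answer above: newest month first within each product.
df = df.sort_values(["prod_desc", "activity_month"], ascending=[True, False])

# transform returns a Series aligned to df's own index, one value per row.
df["pct_ch"] = (
    df.groupby("prod_desc")["prod_count"].transform(lambda s: s.pct_change()) + 1
)
print(df)
```

The first row of each product is NaN because there is no earlier row in that group to compare against.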

In pandas version 1.4.4+ you can use:

product_df["pct_ch"] = 1 + product_df.groupby("prod_desc")["prod_count"].pct_change()
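
To make clear what the `1 + pct_change()` expression computes, here is a small self-contained sketch. It assumes rows are sorted oldest to newest within each product and compares the result with the ratio of each value to the previous value in the same group. The frame `df` and the `ratio` column are illustrative names, not part of the question.

```python
import pandas as pd

# Three months for two products, values taken from the question's data.
df = pd.DataFrame({
    "prod_desc": ["product_a"] * 3 + ["product_b"] * 3,
    "activity_month": pd.to_datetime(
        ["2013-11-30", "2013-12-31", "2014-01-31"] * 2
    ),
    "prod_count": [52, 50, 53, 43, 41, 44],
})

# Oldest month first within each product.
df = df.sort_values(["prod_desc", "activity_month"])

grouped = df.groupby("prod_desc")["prod_count"]

# Growth factor relative to the previous row of the same product.
df["pct_ch"] = 1 + grouped.pct_change()

# The same quantity written out by hand: current value / previous value.
df["ratio"] = df["prod_count"] / grouped.shift(1)
print(df)
```

Both columns should agree up to floating-point rounding, and neither carries a value over from one product into the next.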
Answered By: jezrael

To compute the change over multiple periods, pass periods to pct_change:

product_df['pct_ch'] = (product_df.groupby('prod_desc')['prod_count']
                                  .apply(lambda dfi : dfi.pct_change(periods=126)) + 1)
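
Note that periods must be smaller than the number of rows in a group, otherwise every value in that group is NaN. A minimal sketch with periods=2, a value chosen here only so that five-row groups produce some non-NaN results (the frame `df` is illustrative):

```python
import pandas as pd

months = pd.to_datetime(
    ["2013-11-30", "2013-12-31", "2014-01-31", "2014-02-28", "2014-03-31"]
)

# Five months for two products, values taken from the question's data.
df = pd.DataFrame({
    "prod_desc": ["product_a"] * 5 + ["product_b"] * 5,
    "activity_month": list(months) * 2,
    "prod_count": [52, 50, 53, 52, 50, 43, 41, 44, 43, 41],
})

# Oldest month first within each product.
df = df.sort_values(["prod_desc", "activity_month"])

# Compare each row with the row two positions earlier in the same product.
df["pct_ch"] = df.groupby("prod_desc")["prod_count"].pct_change(periods=2) + 1
print(df)
```

With periods=2 the first two rows of each product are NaN, since they have no row two steps back within their group.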
Answered By: tensor