Calculating summation over months of pandas dataframe

Question:

I have a pandas dataframe given below:

ID       Year       R1  R1_f
KAR1    20201001    1   5
KAR1    20201101    2   6
KAR1    20201201    3   7
KAR1    20210101    4   8
KAR1    20210201    5   9
KAR1    20210301    6   10
KAR1    20210401    7   11
KAR1    20210501    8   12
KAR1    20210601    9   13
KAR1    20210701    10  14
KAR1    20210801    11  15
KAR1    20210901    12  16
KAR2    20201001    4   9
KAR2    20201101    3   8
KAR2    20201201    2   7
KAR2    20210101    1   6
KAR2    20210201    9   5
KAR2    20210301    2   4
KAR2    20210401    6   3
KAR2    20210501    5   2
KAR2    20210601    3   1
KAR2    20210701    30  2
KAR2    20210801    34  3
KAR2    20210901    20  4

I need to transform above dataframe as given below:

    ID Year      R1_sum 3m_R1 6m_R1 9m_R1 12m_R1 R1_f 3m_R1_f 6m_R1_f 9m_R1_f 12m_R1_f 
   KAR1 20210901   12      33    57    72    78    16    45     81      108      126 
   KAR2 20210901   20      84    98    110   119    4     9      15      30        54

In above output dataframe:

R1_sum is having value equal to value in year 20210901 for both Id’s.

3m_R1 is the summation of values of 3 months 20210901 to 20210701 for column R1

6m_R1 is the summation of values of 6 months from 20210901 to 20210401 for column R1

9m_R1 is the summation of values of 9 months from 20210901 to 20210101 for column R1

12m_R1 is the summation of values of 12 months from 20210901 to 20201001 for column R1

R1_f is having value equal to value in year 20210901 for both Id’s.

3m_R1_f is the summation of values of 3 months 20210901 to 20210701 for column R1_f

6m_R1_f is the summation of values of 6 months from 20210901 to 20210401 for column R1_f

9m_R1_f is the summation of values of 9 months from 20210901 to 20210101 for column R1_f

12m_R1_f is the summation of values of 12 months from 20210901 to 20201001 for column R1_f

Please help

Asked By: Python_help

||

Answers:

For count from last months to first months per groups ID first crate helper groups by GroupBy.cumcount and aggregate sum, then aggregate sum with GroupBy.cumsum, reshape by DataFrame.unstack, flatten MultiIndex in columns and add to DataFrame created by GroupBy.last:

Data:

df = pd.DataFrame({'ID': ['KAR1', 'KAR1', 'KAR1', 'KAR1', 'KAR1', 'KAR1', 'KAR1', 
                          'KAR1', 'KAR1', 'KAR1', 'KAR1', 'KAR1', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2', 'KAR2'], 
                   'Year': [20201001, 20201101, 20201201, 20210101, 20210201, 20210301,
                            20210401, 20210501, 20210601, 20210701, 20210801, 20210901, 
                            20201001, 20201101, 20201201, 20210101, 20210201, 20210301,
                            20210401, 20210501, 20210601, 20210701, 20210801, 20210901], 
                   'R1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 4, 3, 2, 1, 9, 2, 6, 5, 3, 30, 34, 20], 
                   'R1_f': [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 9, 8, 7, 6, 5, 4, 3, 2, 1, 2, 3, 4]})
    
print (df)

cols = ['R1','R1_f']
g = df.groupby('ID').cumcount(ascending=False) // 3
df1 = df.groupby(['ID',g])[cols].sum().groupby(level=0).cumsum().unstack()
print (df1)
      R1               R1_f              
       0   1    2    3    0   1    2    3
ID                                       
KAR1  33  57   72   78   45  81  108  126
KAR2  84  98  110  119    9  15   30   54

df2 = df.groupby('ID')[['Year'] + cols].last()
df2.columns = pd.MultiIndex.from_product([df2.columns, [-1]])
print (df2)
            -1  -1   -1
ID                     
KAR1  20210901  12   16
KAR2  20210901  20    4

df = df2.join(df1).sort_index(axis=1)
df.columns = [f'{(b + 1) * 3}m_{a}' if b!=-1 else f'{a}_sum' for a, b in df.columns]
df = df.reset_index()
df.insert(1, 'Year', df.pop('Year_sum'))
print (df)
     ID      Year  R1_sum  3m_R1  6m_R1  9m_R1  12m_R1  R1_f_sum  3m_R1_f  
0  KAR1  20210901      12     33     57     72      78        16       45   
1  KAR2  20210901      20     84     98    110     119         4        9   

   6m_R1_f  9m_R1_f  12m_R1_f  
0       81      108       126  
1       15       30        54  
Answered By: jezrael