How to calculate last two week sum for each group ID
Question:
** I have Input data frame **
ID
Date
Amount
A
2021-08-03
100
A
2021-08-04
100
A
2021-08-06
20
A
2021-08-07
100
A
2021-08-09
300
A
2021-08-11
100
A
2021-08-12
100
A
2021-08-13
10
A
2021-08-23
10
A
2021-08-24
10
A
2021-08-26
10
A
2021-08-28
10
desired Output data frame
ID
Date
Amount
TwoWeekSum
A
2021-08-03
100
320
A
2021-08-04
100
320
A
2021-08-06
20
320
A
2021-08-07
100
320
A
2021-08-09
300
830
A
2021-08-11
100
830
A
2021-08-12
100
830
A
2021-08-13
10
830
A
2021-08-23
10
40
A
2021-08-24
10
40
A
2021-08-26
10
40
A
2021-08-28
10
40
I want to calculate the last two week total sum like
twoweekSum= current week total sum + Previous Week total sum i.e. current week is 34 then twoweekSum is 34 week total sum + 33 week total sum.
Please help me in to get in this in like output data frame so I can use that for further analysis.
Thank You folks !
Answers:
Try using groupby
to group the dataframe by dt.week
(each week), then use transform
sum
to add up the values weekly and repeat the values:
df['TwoWeekSum'] = df.groupby(df['Date'].dt.week)['Amount'].transform('sum')
And then:
print(df)
Gives:
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320
1 A 2021-08-04 100 320
2 A 2021-08-06 20 320
3 A 2021-08-07 100 320
4 A 2021-08-09 300 830
5 A 2021-08-11 100 830
6 A 2021-08-12 100 830
7 A 2021-08-13 10 830
8 A 2021-08-23 10 40
9 A 2021-08-24 10 40
10 A 2021-08-26 10 40
11 A 2021-08-28 10 40
Use:
#convert values to datetimes
df['Date'] = pd.to_datetime(df['Date'])
#convert values to weeks
df['week'] = df['Date'].dt.isocalendar().week
#aggregate sum per ID and weeks, then add missing weeks and sum in rolling
f = lambda x: x.reindex(range(x.index.min(), x.index.max() + 1))
.rolling(2, min_periods=1).sum()
df1 = df.groupby(['ID', 'week'])['Amount'].sum().reset_index(level=0).groupby('ID').apply(f)
print (df1)
Amount
ID week
A 31 320.0
32 830.0
33 510.0
34 40.0
#last add to original DataFrame per ID and weeks
df=df.join(df1.rename(columns={'Amount':'TwoWeekSum'}),on=['ID','week']).drop('week',axis=1)
print (df)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
mapper = df.groupby(df['Date'].astype('Period[W]')).sum().reindex(per, fill_value=0).rolling(2, 1).sum()['Amount']
out = df['Date'].astype('Period[W]').map(mapper)
out
0 320.0
1 320.0
2 320.0
3 320.0
4 830.0
5 830.0
6 830.0
7 830.0
8 40.0
9 40.0
10 40.0
11 40.0
Name: Date, dtype: float64
make out
to TwoWeekSum column
df.assign(TwoWeekSum=out)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
Update
if each ID
, groupby and merge
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
s = df['Date'].astype('Period[W]')
idx = pd.MultiIndex.from_product([df['ID'].unique(), per])
df1 = df.groupby(['ID', s]).sum().reindex(idx, fill_value=0).rolling(2, 1).agg(sum).reset_index().set_axis(['ID', 'period', 'TwoWeekSum'], axis=1)
df.assign(period=s).merge(df1, how='left').drop('period', axis=1)
** I have Input data frame **
ID | Date | Amount |
---|---|---|
A | 2021-08-03 | 100 |
A | 2021-08-04 | 100 |
A | 2021-08-06 | 20 |
A | 2021-08-07 | 100 |
A | 2021-08-09 | 300 |
A | 2021-08-11 | 100 |
A | 2021-08-12 | 100 |
A | 2021-08-13 | 10 |
A | 2021-08-23 | 10 |
A | 2021-08-24 | 10 |
A | 2021-08-26 | 10 |
A | 2021-08-28 | 10 |
desired Output data frame
ID | Date | Amount | TwoWeekSum |
---|---|---|---|
A | 2021-08-03 | 100 | 320 |
A | 2021-08-04 | 100 | 320 |
A | 2021-08-06 | 20 | 320 |
A | 2021-08-07 | 100 | 320 |
A | 2021-08-09 | 300 | 830 |
A | 2021-08-11 | 100 | 830 |
A | 2021-08-12 | 100 | 830 |
A | 2021-08-13 | 10 | 830 |
A | 2021-08-23 | 10 | 40 |
A | 2021-08-24 | 10 | 40 |
A | 2021-08-26 | 10 | 40 |
A | 2021-08-28 | 10 | 40 |
I want to calculate the last two week total sum like
twoweekSum= current week total sum + Previous Week total sum i.e. current week is 34 then twoweekSum is 34 week total sum + 33 week total sum.
Please help me in to get in this in like output data frame so I can use that for further analysis.
Thank You folks !
Try using groupby
to group the dataframe by dt.week
(each week), then use transform
sum
to add up the values weekly and repeat the values:
df['TwoWeekSum'] = df.groupby(df['Date'].dt.week)['Amount'].transform('sum')
And then:
print(df)
Gives:
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320
1 A 2021-08-04 100 320
2 A 2021-08-06 20 320
3 A 2021-08-07 100 320
4 A 2021-08-09 300 830
5 A 2021-08-11 100 830
6 A 2021-08-12 100 830
7 A 2021-08-13 10 830
8 A 2021-08-23 10 40
9 A 2021-08-24 10 40
10 A 2021-08-26 10 40
11 A 2021-08-28 10 40
Use:
#convert values to datetimes
df['Date'] = pd.to_datetime(df['Date'])
#convert values to weeks
df['week'] = df['Date'].dt.isocalendar().week
#aggregate sum per ID and weeks, then add missing weeks and sum in rolling
f = lambda x: x.reindex(range(x.index.min(), x.index.max() + 1))
.rolling(2, min_periods=1).sum()
df1 = df.groupby(['ID', 'week'])['Amount'].sum().reset_index(level=0).groupby('ID').apply(f)
print (df1)
Amount
ID week
A 31 320.0
32 830.0
33 510.0
34 40.0
#last add to original DataFrame per ID and weeks
df=df.join(df1.rename(columns={'Amount':'TwoWeekSum'}),on=['ID','week']).drop('week',axis=1)
print (df)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
mapper = df.groupby(df['Date'].astype('Period[W]')).sum().reindex(per, fill_value=0).rolling(2, 1).sum()['Amount']
out = df['Date'].astype('Period[W]').map(mapper)
out
0 320.0
1 320.0
2 320.0
3 320.0
4 830.0
5 830.0
6 830.0
7 830.0
8 40.0
9 40.0
10 40.0
11 40.0
Name: Date, dtype: float64
make out
to TwoWeekSum column
df.assign(TwoWeekSum=out)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
Update
if each ID
, groupby and merge
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
s = df['Date'].astype('Period[W]')
idx = pd.MultiIndex.from_product([df['ID'].unique(), per])
df1 = df.groupby(['ID', s]).sum().reindex(idx, fill_value=0).rolling(2, 1).agg(sum).reset_index().set_axis(['ID', 'period', 'TwoWeekSum'], axis=1)
df.assign(period=s).merge(df1, how='left').drop('period', axis=1)