How to calculate percent change of each month, from average of previous months
Question:
I have data that looks like this:
YearMonth PageViews Users
202001 100 10
202002 150 12
202003 100 13
202004 120 15
202005 130 10
I want to find out the percentage difference of each new month from the average of the previous month’s usage. For example,
- For 202001 it will be NA as there is no previous month
- For 202002 the average of the previous month
will be only 202001, so the percentage difference for pageviews for
202002 will be 50% and user would be 20% as compared to 202001
- Similarly for 202003, the average of previous months would be
average of 202001 and 202002 combined, so pageview average is 125 (100+150 divided by 2), and user
is 11, so the percentage difference for 202003 will be -20% for
pageviews and +18% for users and so on.
How can I find this using python? Any help would be appreciated. Thank you.
Answers:
Try cumsum()/range
for the cumulative mean:
cols = ['PageViews','Users']
cum_mean = df[cols].cumsum().div(np.arange(len(df))+1, axis=0)
df[cols].div(cum_mean.shift()).sub(1)
Output:
PageViews Users
0 NaN NaN
1 0.500000 0.200000
2 -0.200000 0.181818
3 0.028571 0.285714
4 0.106383 -0.200000
Alternative using df.expanding().mean()
((df[['PageViews','Users']]
.div(df[['PageViews','Users']].expanding().mean().shift()))
.sub(1))
Output:
PageViews Users
0 NaN NaN
1 0.500000 0.200000
2 -0.200000 0.181818
3 0.028571 0.285714
4 0.106383 -0.200000
I have data that looks like this:
YearMonth PageViews Users
202001 100 10
202002 150 12
202003 100 13
202004 120 15
202005 130 10
I want to find out the percentage difference of each new month from the average of the previous month’s usage. For example,
- For 202001 it will be NA as there is no previous month
- For 202002 the average of the previous month
will be only 202001, so the percentage difference for pageviews for
202002 will be 50% and user would be 20% as compared to 202001 - Similarly for 202003, the average of previous months would be
average of 202001 and 202002 combined, so pageview average is 125 (100+150 divided by 2), and user
is 11, so the percentage difference for 202003 will be -20% for
pageviews and +18% for users and so on.
How can I find this using python? Any help would be appreciated. Thank you.
Try cumsum()/range
for the cumulative mean:
cols = ['PageViews','Users']
cum_mean = df[cols].cumsum().div(np.arange(len(df))+1, axis=0)
df[cols].div(cum_mean.shift()).sub(1)
Output:
PageViews Users
0 NaN NaN
1 0.500000 0.200000
2 -0.200000 0.181818
3 0.028571 0.285714
4 0.106383 -0.200000
Alternative using df.expanding().mean()
((df[['PageViews','Users']]
.div(df[['PageViews','Users']].expanding().mean().shift()))
.sub(1))
Output:
PageViews Users
0 NaN NaN
1 0.500000 0.200000
2 -0.200000 0.181818
3 0.028571 0.285714
4 0.106383 -0.200000