Subtract two columns and show outcome in the next line

Question:

I would like to get a cumulative sum of X and also subtract X-Y in each column. The remainder should be found and added to the next row. I would also like the count to reset to 0 at the end of every month. For example, the 1st of every month should be a 0 in X.

For example:

DF1:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 0 0
2021-04-30 1 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 5 5
2021-05-04 0 0

Dfdesired:

Date X Y
2021-04-25 4 4
2021-04-26 0 0
2021-04-27 0 0
2021-04-28 56 53
2021-04-29 3 0
2021-04-30 4 0
2021-05-01 0 0
2021-05-02 5 0
2021-05-03 10 5
2021-05-04 5 0

I have tried this for the cumulative sum but it does not seem to be working and I am unsure how to reset to 0 at the end of the month.

df1['X'] += df1['X'] - df1['Invoice Rejected']
Asked By: John Tayson

||

Answers:

Looks like you want:

df['X'] = df['X'] + (df['X'] - df['Y']).cumsum().shift(1).fillna(0)

Which yields:

          Date     X   Y
0  2021-04-25    4.0   4
1  2021-04-26    0.0   0
2  2021-04-27    0.0   0
3  2021-04-28   56.0  53
4  2021-04-29    3.0   0
5  2021-04-30    4.0   0
6  2021-05-01    4.0   0
7  2021-05-02    9.0   0
8  2021-05-03   14.0   5
9  2021-05-04    9.0   0
Answered By: fishmulch

Try this to apply the calculation on each month:

df["X"] = (df["X"] + 
           df.eval("Diff = X - Y").groupby(pd.Grouper(key="Date", freq="M"))["Diff"]
             .cumsum().shift().fillna(0))

Output:

        Date     X   Y
0 2021-04-25   4.0   4
1 2021-04-26   0.0   0
2 2021-04-27   0.0   0
3 2021-04-28  56.0  53
4 2021-04-29   3.0   0
5 2021-04-30   4.0   0
6 2021-05-01   4.0   0
7 2021-05-02   5.0   0
8 2021-05-03  10.0   5
9 2021-05-04   5.0   0

Details:

  1. Create a "Diff" column temporary
  2. Use pd.Grouper to group on "Date" with a freq="M" and cumsum
  3. Shift and fillna
  4. Add back to "X"
Answered By: Scott Boston