Compute cumulative sum of negative and positive sequences

Question:

Here is my dataframe:

import pandas as pd
my_df = pd.DataFrame({'col_1': [1,2,5,6,4,3,7,8,9,11,14]})
my_df['diff'] = my_df['col_1'].diff()
my_df['cond_sum'] = [None, 1,4,5,-2,-3,4,5,6,8,11]

My original column is col_1. My goal is to obtain column cond_sum. diff column is difference between consecutive rows. cond_sum is cumulative sum of diff column which resets every time there is change in sign.

I can clearly write a function and do it on row by row basis, but I would like something faster.

Asked By: user1700890

||

Answers:

Get the sign of your diff, then create a groupby statement and calculate the cumsum

import numpy as np
import pandas as pd
my_df = pd.DataFrame({'col_1': [1,2,5,6,4,3,7,8,9,11,14]})
my_df['diff'] = my_df['col_1'].diff()

sign = np.sign(my_df['diff'])
groupby_ = (sign != sign.shift(1)).cumsum()
my_df['cond_sum'] = my_df.groupby(groupby_)['diff'].cumsum()
Answered By: bitflip
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.