Calculate the sum of a pandas column depending on the change of values in another column

Question

I have a dataframe as follows:

df = 

     col_1   val_1
0    4.0     0.89
1    4.0     0.56
2    49.0    0.7
3    49.0    1.23
4    49.0    0.8
5    52.0    0.5
6    52.0    0.2
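
To make the example reproducible, the frame shown above can be rebuilt as follows. This is only a minimal sketch: the column names and values are copied from the table, and the default integer index is assumed.

```python
import pandas as pd

# Rebuild the example frame from the table above (default 0..6 index assumed)
df = pd.DataFrame({
    "col_1": [4.0, 4.0, 49.0, 49.0, 49.0, 52.0, 52.0],
    "val_1": [0.89, 0.56, 0.7, 1.23, 0.8, 0.5, 0.2],
})
print(df)
```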

I want to calculate the sum of the column val_1 with a penalising factor which depends on the change in the values of col_1.

For example: if the value in col_1 changes, we take the val_1 value from the row just before the change and subtract a penalising factor of 0.4 from it.

sum = 0.89 + (0.56 - 0.4) (because col_1 changes from 4.0 to 49.0) + 0.7 + 1.23 + (0.8 - 0.4) (because col_1 changes from 49.0 to 52.0) + 0.5 + 0.2

sum = 4.08

Is there a way to do this?

Asked By: Eng123

Answer 1

Use np.where to build an adjusted column, comparing each row's col_1 with the next row's value via .shift(-1). Rows where the two differ get the 0.4 penalty.

import numpy as np

df['val_1_adj'] = np.where(df['col_1'].ne(df['col_1'].shift(-1).ffill()),
                         df['val_1'].sub(0.4), 
                         df['val_1'])

print(df)

   col_1  val_1  val_1_adj
0    4.0   0.89       0.89
1    4.0   0.56       0.16
2   49.0   0.70       0.70
3   49.0   1.23       1.23
4   49.0   0.80       0.40
5   52.0   0.50       0.50
6   52.0   0.20       0.20

df['val_1_adj'].sum()
 4.08
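
To see what the condition inside np.where is doing, the mask can be inspected on its own. The sketch below rebuilds the example frame and splits the comparison into steps; the intermediate names next_val and is_last_of_group are illustrative only and not part of the answer's code.

```python
import pandas as pd

df = pd.DataFrame({
    "col_1": [4.0, 4.0, 49.0, 49.0, 49.0, 52.0, 52.0],
    "val_1": [0.89, 0.56, 0.7, 1.23, 0.8, 0.5, 0.2],
})

# col_1 of the following row; the last row has no successor, so it becomes NaN
next_val = df["col_1"].shift(-1)

# ffill() copies the last valid value into that NaN, so the final row
# compares equal to itself and is not flagged as a change
next_val = next_val.ffill()

# True where the next row holds a different col_1 value (hypothetical name)
is_last_of_group = df["col_1"].ne(next_val)
print(is_last_of_group.tolist())
# [False, True, False, False, True, False, False]
```

Only rows 1 and 4 are flagged, which matches the two penalised values in the question.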

Answered By: Umar.H

Answer 2

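A slight variation on @UmarH’s answer, keeping the penalties in their own column:

import numpy as np

# 0.4 where the next row's col_1 differs; diff(-1) is NaN on the last row, which is excluded
df['penalties'] = np.where(~df.col_1.diff(-1).isin([0, np.nan]), 0.4, 0)
my_sum = (df['val_1'] - df['penalties']).sum()
print(my_sum)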

Output:

4.08
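
Because the penalty is the same for every change, the total can presumably also be obtained without any helper column: sum val_1 and subtract 0.4 once per change in col_1. The following is a sketch under that assumption; PENALTY and n_changes are illustrative names, not part of either answer.

```python
import pandas as pd

df = pd.DataFrame({
    "col_1": [4.0, 4.0, 49.0, 49.0, 49.0, 52.0, 52.0],
    "val_1": [0.89, 0.56, 0.7, 1.23, 0.8, 0.5, 0.2],
})

PENALTY = 0.4  # constant penalising factor from the question

# diff() is NaN on the first row; fill it with 0 so it is not counted as a change
n_changes = df["col_1"].diff().fillna(0).ne(0).sum()

# One penalty per change, subtracted from the plain column sum
total = df["val_1"].sum() - PENALTY * n_changes
print(round(total, 2))
```

This only works while the penalty is a single constant; if it ever depends on the row, the per-row column from the answers above is the safer route.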

Answered By: Bill
