Calculating Carry-over Effect

Question:

I want to calculate the carry-over effect of TV advertising GRP data.
My input data looks like below:

      Variable       Date  Causal  Half_Life
0     TV Model 2016-01-10       0          4
1     TV Model 2016-01-17       0          4
2     TV Model 2016-01-24       0          4
3     TV Model 2016-01-31     100          4
4     TV Model 2016-02-07     110          4
5     TV Model 2016-02-14      89          4
6     TV Model 2016-02-21      57          4
7     TV Model 2016-02-28      90          4
8   TV General 2016-01-10       0          4
9   TV General 2016-01-17       0          4
10  TV General 2016-01-24       0          4
11  TV General 2016-01-31      30          4
12  TV General 2016-02-07      32          4
13  TV General 2016-02-14      42          4
14  TV General 2016-02-21      39          4
15  TV General 2016-02-28      55          4

I want to calculate a new column df['Adstock'] based on the following conditions:

If the row is the first row of its group in df.Variable, then df.Adstock = df.Causal.
If it is not the first row of its group, then df.Adstock = df.Causal + 0.5**(1/df.Half_Life) * df.Adstock from the previous row.
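To make the rule concrete, here is a minimal pure-Python sketch of the recurrence for a single group. The function name `adstock_series` is only illustrative and the inputs are the TV General values from the table above. Each value depends on the previously computed adstock, not on the previous Causal value.

```python
def adstock_series(causal, half_life):
    """Apply adstock = causal + 0.5**(1/half_life) * previous adstock to one group."""
    result = []
    previous = None  # no carry-over exists before the first row of the group
    for value, hl in zip(causal, half_life):
        if previous is None:
            # First row of the group: adstock is just the Causal value
            current = float(value)
        else:
            # Later rows: current Causal plus the decayed adstock of the previous row
            current = value + 0.5 ** (1 / hl) * previous
        result.append(current)
        previous = current
    return result


# TV General rows from the input data, with Half_Life = 4 on every row
tv_general = adstock_series([0, 0, 0, 30, 32, 42, 39, 55], [4] * 8)
print([round(v, 6) for v in tv_general])
```

For these inputs the result should agree with the TV General rows of the required output shown further down.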

I am using the below code:

import pandas as pd
import numpy as np
import numpy.random as random
import statsmodels.api as sm
import statsmodels.tsa as tsa
import statsmodels.formula.api as smf
import datetime

df = pd.read_excel('RC Data.xlsx')


df['Adstock'] = 0

df['Adstock'] = np.where(df['Variable'] == df['Variable'].shift(1), df['Adstock'].shift(1)*(0.5**(1/df['Half_Life'])) + df['Causal'], df['Causal'])

The output I get is as below:

      Variable       Date  Causal  Half_Life  Adstock
0     TV Model 2016-01-10       0          4      0.0
1     TV Model 2016-01-17       0          4      0.0
2     TV Model 2016-01-24       0          4      0.0
3     TV Model 2016-01-31     100          4    100.0
4     TV Model 2016-02-07     110          4    110.0
5     TV Model 2016-02-14      89          4     89.0
6     TV Model 2016-02-21      57          4     57.0
7     TV Model 2016-02-28      90          4     90.0
8   TV General 2016-01-10       0          4      0.0
9   TV General 2016-01-17       0          4      0.0
10  TV General 2016-01-24       0          4      0.0
11  TV General 2016-01-31      30          4     30.0
12  TV General 2016-02-07      32          4     32.0
13  TV General 2016-02-14      42          4     42.0
14  TV General 2016-02-21      39          4     39.0
15  TV General 2016-02-28      55          4     55.0

But the required output should look like this:

      Variable       Date  Causal  Half_Life     Adstock
0     TV Model 2016-01-10       0          4    0.000000
1     TV Model 2016-01-17       0          4    0.000000
2     TV Model 2016-01-24       0          4    0.000000
3     TV Model 2016-01-31     100          4  100.000000
4     TV Model 2016-02-07     110          4  194.089642
5     TV Model 2016-02-14      89          4  252.209284
6     TV Model 2016-02-21      57          4  269.081883
7     TV Model 2016-02-28      90          4  316.269991
8   TV General 2016-01-10       0          4    0.000000
9   TV General 2016-01-17       0          4    0.000000
10  TV General 2016-01-24       0          4    0.000000
11  TV General 2016-01-31      30          4   30.000000
12  TV General 2016-02-07      32          4   57.226892
13  TV General 2016-02-14      42          4   90.121889
14  TV General 2016-02-21      39          4  114.783173
15  TV General 2016-02-28      55          4  151.520759

Please help.

Asked By: Chirayu05

Answers:

Here is my solution. I think it is hard to vectorize, because each row depends on the result computed for the previous row, so I loop over each group:

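adstock = []
for _, group in df.groupby('Variable', sort=False):
    # Running adstock values for this group only
    group_adstock = []
    for _, row in group.iterrows():
        if not group_adstock:
            # First row of the group: no carry-over yet
            group_adstock.append(row['Causal'])
        else:
            # Later rows: current Causal plus the decayed previous adstock
            group_adstock.append(row['Causal'] + 0.5**(1/row['Half_Life'])*group_adstock[-1])
    adstock.extend(group_adstock)
# Assumes the rows of each group are contiguous, as in the question's data
df['New'] = adstock
df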
Out[982]: 
     Variable        Date  Causal  Half_Life         New
0     TVModel  2016-01-10       0          4    0.000000
1     TVModel  2016-01-17       0          4    0.000000
2     TVModel  2016-01-24       0          4    0.000000
3     TVModel  2016-01-31     100          4  100.000000
4     TVModel  2016-02-07     110          4  194.089642
5     TVModel  2016-02-14      89          4  252.209284
6     TVModel  2016-02-21      57          4  269.081883
7     TVModel  2016-02-28      90          4  316.269991
8   TVGeneral  2016-01-10       0          4    0.000000
9   TVGeneral  2016-01-17       0          4    0.000000
10  TVGeneral  2016-01-24       0          4    0.000000
11  TVGeneral  2016-01-31      30          4   30.000000
12  TVGeneral  2016-02-07      32          4   57.226892
13  TVGeneral  2016-02-14      42          4   90.121889
14  TVGeneral  2016-02-21      39          4  114.783173
15  TVGeneral  2016-02-28      55          4  151.520759
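
As a possible alternative, if the half-life is constant within each group (an assumption that holds for the data in the question), the inner loop can be written as a matrix product: the adstock at row t is the sum of decay**(t-k) * Causal[k] over all earlier rows k of the same group. The sketch below uses only NumPy and pandas. The helper name `adstock_constant_decay` is hypothetical, the Date column is left out for brevity, and the weight matrix takes O(n²) memory per group, so this is only reasonable for short series such as weekly data.

```python
import numpy as np
import pandas as pd


def adstock_constant_decay(causal, half_life):
    """Adstock for one group, assuming one half-life for the whole group."""
    causal = np.asarray(causal, dtype=float)
    decay = 0.5 ** (1.0 / half_life)
    n = len(causal)
    # lags[t, k] = t - k, the number of periods between row k and row t
    lags = np.arange(n)[:, None] - np.arange(n)[None, :]
    # Weight decay**(t - k) for past and current rows, 0 for future rows
    weights = np.where(lags >= 0, decay ** np.clip(lags, 0, None), 0.0)
    return weights @ causal


# Sample data copied from the question (Date column omitted)
df = pd.DataFrame({
    'Variable': ['TV Model'] * 8 + ['TV General'] * 8,
    'Causal': [0, 0, 0, 100, 110, 89, 57, 90, 0, 0, 0, 30, 32, 42, 39, 55],
    'Half_Life': [4] * 16,
})

adstock = pd.Series(np.nan, index=df.index)
for _, group in df.groupby('Variable', sort=False):
    # Assumption: every row of a group carries the same Half_Life
    half_life = group['Half_Life'].iloc[0]
    adstock.loc[group.index] = adstock_constant_decay(group['Causal'], half_life)
df['Adstock'] = adstock
print(df)
```

Because the results are written back through `group.index`, this version does not require the rows of a group to be contiguous. If Half_Life varies from row to row within a group, the row-by-row loop is still needed.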
Answered By: BENY

import numpy as np


def decay(df, row_id):
    # Current Causal plus the previous row's adstock, decayed by 0.5**(1/half_life)
    causal_value = df.at[row_id, 'Causal']
    half_life = df.at[row_id, 'Half_Life']
    # Assumes a default RangeIndex with each group's rows contiguous,
    # so row_id - 1 is the previous row of the same group
    ad_stock_value = df.at[row_id - 1, 'adstock_value']
    return causal_value + 0.5 ** (1 / half_life) * ad_stock_value


def adstock(df):
    # Add the new column "adstock_value"; it is filled in row by row below
    df['adstock_value'] = np.nan
    visited = set()

    for i in range(len(df)):
        var = df.at[i, 'Variable']
        if var in visited:
            # Not the first row of its group: apply the carry-over
            df.at[i, 'adstock_value'] = decay(df, i)
        else:
            # First row of its group: adstock equals Causal
            visited.add(var)
            df.at[i, 'adstock_value'] = df.at[i, 'Causal']


adstock(df)
df
Out[982]: 
     Variable        Date  Causal  Half_Life  adstock_value
0     TVModel  2016-01-10       0          4       0.000000
1     TVModel  2016-01-17       0          4       0.000000
2     TVModel  2016-01-24       0          4       0.000000
3     TVModel  2016-01-31     100          4     100.000000
4     TVModel  2016-02-07     110          4     194.089642
5     TVModel  2016-02-14      89          4     252.209284
6     TVModel  2016-02-21      57          4     269.081883
7     TVModel  2016-02-28      90          4     316.269991
8   TVGeneral  2016-01-10       0          4       0.000000
9   TVGeneral  2016-01-17       0          4       0.000000
10  TVGeneral  2016-01-24       0          4       0.000000
11  TVGeneral  2016-01-31      30          4      30.000000
12  TVGeneral  2016-02-07      32          4      57.226892
13  TVGeneral  2016-02-14      42          4      90.121889
14  TVGeneral  2016-02-21      39          4     114.783173
15  TVGeneral  2016-02-28      55          4     151.520759

Answered By: Sarbani Dasgupta