Vectorized calculation of new timeseries in pandas dataframe

Question:

I have a pandas dataframe and I am trying to estimate a new timeseries V(t) based on the values of an existing timeseries B(t). I have written a minimal reproducible example to generate a sample dataframe as follows:

import pandas as pd
import numpy as np

lenb = 5000
lenv = 200
l    = 5

B = pd.DataFrame({'a': np.arange(0, lenb, 1), 'b': np.arange(0, lenb, 1)},
                 index=pd.date_range('2022-01-01', periods=lenb, freq='2s'))

I want to calculate V(t) for all times ‘t’ in the timeseries B as:

V(t) = (B(t-2*l) + 4*B(t-l) + 6*B(t) + 4*B(t+l) + 1*B(t+2*l)) / 16

How can I perform this calculation in a vectorized manner in pandas? Let's say that l = 5.

Would this be the correct way to do it?

def V_t(B, l):
    V = (B.shift(-2*l) + 4*B.shift(-l) + 6*B + 4*B.shift(l) + B.shift(2*l)) / 16
    return V
Asked By: Jokerp
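
A quick sanity check of the shift-based formula can be sketched as follows. In pandas, `shift(k)` with a positive `k` moves values down, so row t ends up holding B(t-k), while `shift(-k)` gives B(t+k). Because the weights here are symmetric, the sign convention does not change the result. The small frame `B_small` and its length `n` are assumptions made for illustration only; they are not part of the question's data.

```python
import numpy as np
import pandas as pd

l = 5
n = 50  # small illustrative length (an assumption, not the question's lenb)

B_small = pd.DataFrame(
    {'a': np.arange(n, dtype=float), 'b': np.arange(n, dtype=float)},
    index=pd.date_range('2022-01-01', periods=n, freq='2s'),
)

def V_t(B, l):
    # shift(k) with k > 0 yields B(t - k); shift(-k) yields B(t + k)
    return (B.shift(2*l) + 4*B.shift(l) + 6*B + 4*B.shift(-l) + B.shift(-2*l)) / 16

V = V_t(B_small, l)

# Rows within 2*l of either end have no complete window, so they are NaN
n_nan_head = int(V['a'].iloc[:2*l].isna().sum())
n_nan_tail = int(V['a'].iloc[-2*l:].isna().sum())

# For a linear input, the symmetric weights (which sum to 16) reproduce the input
interior = V.iloc[2*l:-2*l]
matches_input = bool(np.allclose(interior, B_small.iloc[2*l:-2*l]))
```

The first and last 2*l rows come out as NaN because the shifted values fall outside the series; whether to drop or fill them depends on the use case.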

||

Answers:

I would have done it as you suggested in your latest edit. Here is an alternative that avoids typing out every shift call when the list of factors/multipliers is arbitrarily long:

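import numpy as np
import pandas as pd

def V_t(B, l):
    # Weights of the formula; they sum to 16
    X = [1, 4, 6, 4, 1]
    # Shift applied to B for each weight
    Y = [-2*l, -l, 0, l, 2*l]
    # np.add.reduce returns a plain array, so re-wrap it with B's labels
    return pd.DataFrame(np.add.reduce([x*B.shift(y) for x, y in zip(X, Y)])/16,
                        index=B.index, columns=B.columns)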
Answered By: mozway
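
As a hedged sketch, one way to check that a generic weighted sum of shifts matches the explicit five-term expression is shown below. The helper name `weighted_shift_sum` and the random frame `B_demo` are hypothetical and introduced only for this illustration. This variant uses Python's built-in `sum` on the shifted DataFrames, which keeps the index and columns without re-wrapping the result; it is a variation on the approach above, not a replacement for it.

```python
import numpy as np
import pandas as pd

def weighted_shift_sum(B, weights, offsets):
    # Hypothetical helper: sum of w * B.shift(k) over (w, k) pairs,
    # normalised by the total weight
    total = sum(w * B.shift(k) for w, k in zip(weights, offsets))
    return total / sum(weights)

l = 5
rng = np.random.default_rng(0)
B_demo = pd.DataFrame(
    rng.normal(size=(100, 2)), columns=['a', 'b'],
    index=pd.date_range('2022-01-01', periods=100, freq='2s'),
)

general = weighted_shift_sum(B_demo, [1, 4, 6, 4, 1], [-2*l, -l, 0, l, 2*l])
explicit = (B_demo.shift(-2*l) + 4*B_demo.shift(-l) + 6*B_demo
            + 4*B_demo.shift(l) + B_demo.shift(2*l)) / 16

# equal_nan=True because both results are NaN in the first and last 2*l rows
same = bool(np.allclose(general, explicit, equal_nan=True))
```

Dividing by `sum(weights)` instead of a hard-coded 16 keeps the result normalised if the weight list is changed.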