Rolling sum of dataframe column with custom function gets really slow on large averaging windows. Can I fix this with convolve?


I am using the following function to estimate the Gaussian-window rolling average of my timeseries. Though it works great for small averaging windows, it crashes (or gets extremely slow) for larger averaging windows.

import numpy as np
import pandas as pd

def norm_factor_Gauss_window(s, dt):
    numer         = np.arange(-3*s, 3*s+dt, dt)
    multiplic_fac = np.exp(-(numer)**2/(2*s**2))
    norm_factor   = np.sum(multiplic_fac)
    window        = len(multiplic_fac)
    return window,  multiplic_fac, norm_factor

# Create dataframe for MRE
aa = np.sin(np.linspace(0,2*np.pi,1000000))+0.15*np.random.rand(1000000)
df = pd.DataFrame({'x':aa})

hmany  = 10
dt     = 1      # ['seconds']
s      = hmany*dt  # Define averaging window size ['s']

# Estimate the multiplication factor, normalization factor, etc.
window, multiplic_fac, norm_factor = norm_factor_Gauss_window(s, dt)

# averaged timeseries
res2 = (1/norm_factor)*df.x.rolling(window, center=True).apply(
    lambda x: (x * multiplic_fac).sum(),
    raw=True,
    engine='numba',
    engine_kwargs={'nopython': True, 'parallel': True},
    args=None,
    kwargs=None,
)


I am aware that people usually speed up moving-average operations using convolve (e.g., "How to calculate rolling / moving average using python + NumPy / SciPy?").

Would it be possible to use convolve here somehow to fix this issue? Also, are there any other suggestions that would help me speed up the operation for large averaging windows?

Asked By: Jokerp



Using the numba njit decorator on the norm_factor_Gauss_window function, I get a 10x speedup on my PC (from 10 µs to 1 µs) in the execution time of this function.

import numba as nb
import numpy as np

@nb.njit
def norm_factor_Gauss_window(s, dt):
    numer         = np.arange(-3*s, 3*s+dt, dt)
    multiplic_fac = np.exp(-(numer)**2/(2*s**2))
    norm_factor   = np.sum(multiplic_fac)
    window        = len(multiplic_fac)
    return window,  multiplic_fac, norm_factor

This is not a big improvement relative to the total execution time, which is dominated by the rolling operation (about 900 ms on my PC). With some adjustments I was able to get down to 650 ms (about 28% less execution time) by removing the keyword 'parallel': in this case there is nothing that can be parallelized with this approach, as evidenced by the warning 'NumbaPerformanceWarning'. I also removed the other keywords, as they are the default values.

df.x.rolling(window, center=True).apply(lambda x: (x * multiplic_fac).sum(), 
                                        raw=True, engine='numba')
Answered By: Massifox

I was able to drastically improve the speed of this code using the following:

import numpy as np
import pandas as pd
from scipy import signal

def norm_factor_Gauss_window(s, dt):
    numer         = np.arange(-3*s, 3*s+dt, dt)
    multiplic_fac = np.exp(-(numer)**2/(2*s**2))
    norm_factor   = np.sum(multiplic_fac)
    window        = len(multiplic_fac)
    return window,  multiplic_fac, norm_factor

# Create dataframe for MRE
aa = np.sin(np.linspace(0,2*np.pi,1000000))+0.15*np.random.rand(1000000)
df = pd.DataFrame({'x':aa})

hmany  = 10
dt     = 1      # ['seconds']
s      = hmany*dt  # Define averaging window size ['s']

# Estimate the multiplication factor, normalization factor, etc.
window, multiplic_fac, norm_factor= norm_factor_Gauss_window(s, dt)

# averaged timeseries

res2 = (1/norm_factor)*signal.fftconvolve(df.x.values, multiplic_fac[::-1], 'same')
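
As a sanity check, here is a minimal sketch comparing the fftconvolve result against the rolling apply on a shorter series. The helper name gauss_kernel, the series length of 5000, and the fixed random seed are my own choices for illustration, not part of the original code. One difference to be aware of: rolling(..., center=True) returns NaN wherever the window is incomplete, while fftconvolve with 'same' zero-pads the ends, so the first and last window // 2 samples are not comparable and are masked here.

```python
import numpy as np
import pandas as pd
from scipy import signal

def gauss_kernel(s, dt):
    # Same weights as norm_factor_Gauss_window, truncated at +/- 3*s
    t = np.arange(-3 * s, 3 * s + dt, dt)
    return np.exp(-t**2 / (2 * s**2))

# Shorter series than the MRE so the pure-pandas reference runs quickly
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 2 * np.pi, 5000)) + 0.15 * rng.random(5000)

kernel = gauss_kernel(s=10, dt=1)
window = len(kernel)   # 61 samples for s=10, dt=1
half = window // 2

# Reference: weighted rolling sum (NaN where the window is incomplete)
rolled = (
    pd.Series(x)
    .rolling(window, center=True)
    .apply(lambda w: (w * kernel).sum(), raw=True)
    / kernel.sum()
)

# FFT-based version: zero-pads the ends instead of returning NaN
smoothed = signal.fftconvolve(x, kernel[::-1], mode="same") / kernel.sum()

# Mask the edge samples so both outputs cover the same positions
smoothed_masked = smoothed.copy()
smoothed_masked[:half] = np.nan
smoothed_masked[-half:] = np.nan

# Largest absolute difference over the positions where both are defined
max_diff = np.nanmax(np.abs(rolled.to_numpy() - smoothed_masked))
print(max_diff)
```

If the edge values matter, masking them as above (or padding the input before convolving) is probably safer than using the zero-padded values directly, since those are biased toward zero.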

Answered By: Jokerp