How to perform a sliding-window correlation on a pandas DataFrame with a datetime index?

Question:

I am working with stock data coming from Yahoo Finance.

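import pandas as pd
import yfinance as yf

def load_y_finance_data(y_finance_tickers: list):
    # Collect the adjusted closing price of each ticker into one DataFrame
    df = pd.DataFrame()
    print("Loading Y-Finance data ...")
    for ticker in y_finance_tickers:
        # Strip the "^" index prefix so column names are plain, e.g. "VIX"
        df[ticker.replace("^", "")] = yf.download(
            ticker,
            auto_adjust=True,  # only download adjusted data
            progress=False,
        )["Close"]
    print("Done loading Y-Finance data!")
    return df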

x = load_y_finance_data(["^VIX", "^GSPC"])
x
            VIX         GSPC
Date        
1990-01-02  17.240000   359.690002
1990-01-03  18.190001   358.760010
1990-01-04  19.219999   355.670013
1990-01-05  20.110001   352.200012
1990-01-08  20.260000   353.790009
DataSize=(8301, 2)

I want to compute the correlation between the two columns (using the corr() function) over a sliding 50-day window: first for days 1 to 50, then, after moving the window forward by one day, for days 2 to 51, and so on.

I tried the naive approach of a for loop, and it works, but it takes too much time. Code below:

data_size = len(x)
period = 50

df = pd.DataFrame()

for i in range(data_size-period):
    df.loc[i, "GSPC_VIX_corr"] = x[["GSPC", "VIX"]][i:i+period].corr().loc["GSPC", "VIX"]

df
    GSPC_VIX_corr
0   -0.703156
1   -0.651513
2   -0.602876
3   -0.583256
4   -0.589086

How can I do this more efficiently? Is there any built-in way I can use?
Thanks 🙂

Asked By: Pratik Ghodke

Answers:

You can use pandas' rolling-window functionality, which supports many different aggregations, including corr(). Instead of your for loop, do this:

x["VIX"].rolling(window=period).corr(x["GSPC"])
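
To see how the rolling result lines up with the loop from the question, here is a minimal sketch. It uses synthetic random-walk data rather than the Yahoo Finance download (the seed, the 300-row length and the names rolling_corr, loop_corr and matches are assumptions made for illustration), so the numbers will not match the output shown in the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded prices (assumption: random walks, not real data)
rng = np.random.default_rng(0)
dates = pd.bdate_range("1990-01-02", periods=300)
x = pd.DataFrame(
    {
        "VIX": 20 + rng.normal(size=300).cumsum(),
        "GSPC": 350 + rng.normal(size=300).cumsum(),
    },
    index=dates,
)

period = 50

# Vectorised rolling correlation; the first period - 1 rows are NaN
# because those windows do not yet contain 50 observations
rolling_corr = x["VIX"].rolling(window=period).corr(x["GSPC"])

# Loop version for comparison, one window at a time, labelled by window end date
loop_corr = pd.Series(
    [
        x["GSPC"].iloc[i:i + period].corr(x["VIX"].iloc[i:i + period])
        for i in range(len(x) - period + 1)
    ],
    index=x.index[period - 1:],
)

# Drop the warm-up NaNs so both series cover the same windows
rolling_valid = rolling_corr.dropna()
matches = np.allclose(rolling_valid.to_numpy(), loop_corr.to_numpy())
print(matches)
```

Note that the rolling result keeps the datetime index and labels each value with the last day of its window, whereas the loop in the question labels it with the integer position of the first day. Loop row i therefore corresponds to rolling row i + period - 1. The loop in the question also uses range(data_size-period), which stops one window short of the end of the data; the sketch uses len(x) - period + 1 to include the final window.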
Answered By: Johannes Schöck