Rolling window and problem with slice indexing

Question:

I use this Python code to calculate the CQ statistic for each year in my dataset:

import pandas as pd
import numpy as np
import time
from CrossQuantilogram import Bootstrap
import CrossQuantilogram
from CrossQuantilogram import LjungBoxQ

d1=pd.read_csv(r"...sgold.csv")
d2=pd.read_csv(r"...cgold.csv")
def CQBS_years(d1,a1,d2,a2,k=1,window=1,cqcl=0.95,testf=LjungBoxQ,testcl=0.95,
                all=False,n=1000,verbose=True):     
       
    startyear,endyear = 2010, 2019
    if window>1+endyear-startyear:
        raise ValueError("length of window must <= data range")

    cqres,yearlist=[],[(str(x),str(x+window-1)) for x in range(startyear,endyear-window+2)]    
    for start,end in yearlist:
        if verbose:
            print("Processing {}/{}   ".format(end,endyear),end='\r')
        cqres.append(CQBS(d1[start:end],a1,d2[start:end],a2,k,cqcl,testf,testcl,n,False))

    res,yearindex=[],[str(x) for x in range(startyear+window-1,endyear+1)]
    if all:
        for i in [[df.iloc[x] for df in cqres] for x in range(k)]:
            merged = pd.concat(i,ignore_index=True)
            merged.index = yearindex
            res.append(merged)        
    else:
        res=pd.concat(cqres,ignore_index=True)
        res.index = yearindex
    if verbose:
        print("Bootstraping CQ done      ")
    return res
%%time
CrossQuantilogram.CQBS_years(d1["day"],0.1,d2["day"],0.1,k=1,window=1,cqcl=0.95,testcl=0.95,all=False,n=1000,verbose=True)

When I run the CQBS_years function, I get this error: "cannot do slice indexing on RangeIndex with these indexers [2010] of type str". I know this is related to the dates in my CSV files being read as strings, but I don’t know how to solve it.

The dataset is available at this link: https://drive.google.com/drive/folders/1PXyXP3AK8_KYxRYfZWO3VHzPPueG3FEF?usp=sharing Here is the source of the code: https://github.com/wangys96/Cross-Quantilogram Any help is greatly appreciated.

Asked By: AMIR

||

Answers:

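The problem is that selecting data by year in pandas, as in df['2010':'2010'], requires the DataFrame df to be indexed by dates, i.e. to have a DatetimeIndex. By default, read_csv gives the frame a RangeIndex of integers, which cannot be sliced with strings.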

Thus, you need to read the data, parse the column with dates as datetime, and set the index to that column. This can be achieved in one step:

d1=pd.read_csv(r"...sgold.csv", parse_dates=[0], index_col=[0])
d2=pd.read_csv(r"...cgold.csv", parse_dates=[0], index_col=[0])
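
To see the difference without the real files, here is a minimal sketch. The inline CSV text and its "date" column name are made up for illustration (only the "day" column name comes from the question), and it assumes the dates sit in the first column, as the read_csv calls above do. It shows the string slice failing on the default index and working once the first column is parsed as dates and used as the index:

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV files: a date column followed by "day".
csv_text = """date,day
2010-01-04,0.5
2010-06-15,-0.2
2011-03-01,0.1
2012-07-20,0.3
"""

# Default read: the index is a RangeIndex, so slicing by year strings fails.
plain = pd.read_csv(io.StringIO(csv_text))
try:
    plain["day"]["2010":"2010"]
    slice_failed = False
except (TypeError, KeyError) as exc:  # exact exception may vary by pandas version
    slice_failed = True
    print("Slice on default index failed:", exc)

# Parse the first column as dates and use it as the index (same arguments as above).
dated = pd.read_csv(io.StringIO(csv_text), parse_dates=[0], index_col=[0])

# With a DatetimeIndex, year strings select every row that falls in that range.
one_year = dated["day"]["2010":"2010"]
two_years = dated["day"]["2010":"2011"]
print(one_year)
print(two_years)
```

If the data is already loaded, converting the date column with pd.to_datetime and then calling set_index on it should give the same kind of index.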
Answered By: qaziqarta