Fourier series columns don't appear in DeterministicProcess()

Question:

I have been refreshing my time-series skills and I'm having trouble creating a Fourier series. Here is the data (if you run everything together, you will get the same plots and final table):

import pandas as pd
from pandas import Period  # needed for the Period(...) keys in the data below
import matplotlib.pyplot as plt
from statsmodels.tsa.deterministic import CalendarFourier, DeterministicProcess
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({'Pax': {Period('1949-01', 'M'): 112,  Period('1949-02', 'M'): 118,  Period('1949-03', 'M'): 132,  Period('1949-04', 'M'): 129,  Period('1949-05', 'M'): 121,  Period('1949-06', 'M'): 135,  Period('1949-07', 'M'): 148,  Period('1949-08', 'M'): 148,  Period('1949-09', 'M'): 136,  Period('1949-10', 'M'): 119,  Period('1949-11', 'M'): 104,  Period('1949-12', 'M'): 118,  Period('1950-01', 'M'): 115,  Period('1950-02', 'M'): 126,  Period('1950-03', 'M'): 141,  Period('1950-04', 'M'): 135,  Period('1950-05', 'M'): 125,  Period('1950-06', 'M'): 149,  Period('1950-07', 'M'): 170,  Period('1950-08', 'M'): 170,  Period('1950-09', 'M'): 158,  Period('1950-10', 'M'): 133,  Period('1950-11', 'M'): 114,  Period('1950-12', 'M'): 140,  Period('1951-01', 'M'): 145,  Period('1951-02', 'M'): 150,  Period('1951-03', 'M'): 178,  Period('1951-04', 'M'): 163,  Period('1951-05', 'M'): 172,  Period('1951-06', 'M'): 178,  Period('1951-07', 'M'): 199,  Period('1951-08', 'M'): 199,  Period('1951-09', 'M'): 184,  Period('1951-10', 'M'): 162,  Period('1951-11', 'M'): 146,  Period('1951-12', 'M'): 166,  Period('1952-01', 'M'): 171,  Period('1952-02', 'M'): 180,  Period('1952-03', 'M'): 193,  Period('1952-04', 'M'): 181,  Period('1952-05', 'M'): 183,  Period('1952-06', 'M'): 218,  Period('1952-07', 'M'): 230,  Period('1952-08', 'M'): 242,  Period('1952-09', 'M'): 209,  Period('1952-10', 'M'): 191,  Period('1952-11', 'M'): 172,  Period('1952-12', 'M'): 194,  Period('1953-01', 'M'): 196,  Period('1953-02', 'M'): 196,  Period('1953-03', 'M'): 236,  Period('1953-04', 'M'): 235,  Period('1953-05', 'M'): 229,  Period('1953-06', 'M'): 243,  Period('1953-07', 'M'): 264,  Period('1953-08', 'M'): 272,  Period('1953-09', 'M'): 237,  Period('1953-10', 'M'): 211,  Period('1953-11', 'M'): 180,  Period('1953-12', 'M'): 201,  Period('1954-01', 'M'): 204,  Period('1954-02', 'M'): 188,  Period('1954-03', 'M'): 235,  Period('1954-04', 'M'): 227,  Period('1954-05', 'M'): 234,  Period('1954-06', 
'M'): 264,  Period('1954-07', 'M'): 302,  Period('1954-08', 'M'): 293,  Period('1954-09', 'M'): 259,  Period('1954-10', 'M'): 229,  Period('1954-11', 'M'): 203,  Period('1954-12', 'M'): 229,  Period('1955-01', 'M'): 242,  Period('1955-02', 'M'): 233,  Period('1955-03', 'M'): 267,  Period('1955-04', 'M'): 269,  Period('1955-05', 'M'): 270,  Period('1955-06', 'M'): 315,  Period('1955-07', 'M'): 364,  Period('1955-08', 'M'): 347,  Period('1955-09', 'M'): 312,  Period('1955-10', 'M'): 274,  Period('1955-11', 'M'): 237,  Period('1955-12', 'M'): 278,  Period('1956-01', 'M'): 284,  Period('1956-02', 'M'): 277,  Period('1956-03', 'M'): 317,  Period('1956-04', 'M'): 313,  Period('1956-05', 'M'): 318,  Period('1956-06', 'M'): 374,  Period('1956-07', 'M'): 413,  Period('1956-08', 'M'): 405,  Period('1956-09', 'M'): 355,  Period('1956-10', 'M'): 306,  Period('1956-11', 'M'): 271,  Period('1956-12', 'M'): 306,  Period('1957-01', 'M'): 315,  Period('1957-02', 'M'): 301,  Period('1957-03', 'M'): 356,  Period('1957-04', 'M'): 348,  Period('1957-05', 'M'): 355,  Period('1957-06', 'M'): 422,  Period('1957-07', 'M'): 465,  Period('1957-08', 'M'): 467,  Period('1957-09', 'M'): 404,  Period('1957-10', 'M'): 347,  Period('1957-11', 'M'): 305,  Period('1957-12', 'M'): 336,  Period('1958-01', 'M'): 340,  Period('1958-02', 'M'): 318,  Period('1958-03', 'M'): 362,  Period('1958-04', 'M'): 348,  Period('1958-05', 'M'): 363,  Period('1958-06', 'M'): 435,  Period('1958-07', 'M'): 491,  Period('1958-08', 'M'): 505,  Period('1958-09', 'M'): 404,  Period('1958-10', 'M'): 359,  Period('1958-11', 'M'): 310,  Period('1958-12', 'M'): 337,  Period('1959-01', 'M'): 360,  Period('1959-02', 'M'): 342,  Period('1959-03', 'M'): 406,  Period('1959-04', 'M'): 396,  Period('1959-05', 'M'): 420,  Period('1959-06', 'M'): 472,  Period('1959-07', 'M'): 548,  Period('1959-08', 'M'): 559,  Period('1959-09', 'M'): 463,  Period('1959-10', 'M'): 407,  Period('1959-11', 'M'): 362,  Period('1959-12', 'M'): 405,  
Period('1960-01', 'M'): 417,  Period('1960-02', 'M'): 391,  Period('1960-03', 'M'): 419,  Period('1960-04', 'M'): 461,  Period('1960-05', 'M'): 472,  Period('1960-06', 'M'): 535,  Period('1960-07', 'M'): 622,  Period('1960-08', 'M'): 606,  Period('1960-09', 'M'): 508,  Period('1960-10', 'M'): 461,Period('1960-11', 'M'): 390,Period('1960-12', 'M'): 432}})
df.head()

I then create a constant and a linear trend:

dp = DeterministicProcess(
        index=df.index,
        constant=True,
        order=1,
        seasonal=False,
        #additional_terms=[fourier],
        drop=True,
    )

X = dp.in_sample()
y = df.squeeze()

I fit these with a linear regression, detrend the time series, and plot the results:

model_pax = LinearRegression().fit(X, y)
y_pred_pax = pd.Series(model_pax.predict(X), index=X.index)
y_detrended = y-y_pred_pax

fig, (ax1, ax2) = plt.subplots(2,1, sharex=True, figsize=(10, 4))

ax1 = y.plot(label='Pax', ax=ax1)
ax1 = y_pred_pax.plot(label='trend', ax=ax1)
ax1.legend()

ax2 = y_detrended.plot(label='Pax detrended', ax=ax2)
ax2.legend()
plt.show()

Now I want to capture the seasonality, and for this I need a Fourier series. However, when I create the deterministic process and include the Fourier terms, the Fourier columns don't appear.

fourier =  CalendarFourier(freq="M", order=4)
dp = DeterministicProcess(
        index=y_detrended.index,
        constant=True,
        order=0,
        seasonal=False,
        additional_terms=[fourier],
        drop=True,
    )

dp.in_sample().head()

Only the constant appears, without the Fourier columns. Why? I have tried this with other datasets and it works perfectly, and I don't see any difference here. What am I missing?

Asked By: Chris

Answers:

I found the solution. I just had to change the freq argument from "M":

CalendarFourier(freq="M", order=4)

to "Y":

CalendarFourier(freq="Y", order=4)

I don't fully understand why this works. It seems that CalendarFourier() checks whether the index passed to DeterministicProcess is compatible with the frequency it is given, but I can't be sure of this. I hope someone can offer a better explanation.

Answered By: Chris

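To explain your own answer further:

freq="M" generates Fourier terms with a period of one month, so the pattern repeats every month. Your data has only one observation per month, so a monthly cycle has nothing to vary over.

freq="Y" generates terms that repeat every year. Your seasonality is annual, so the yearly period is clearly the one you want here.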

CalendarFourier(freq="M", order=4)
CalendarFourier(freq="Y", order=4)
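
A likely reason the columns vanish rather than just being useless: as far as I understand it, drop=True makes DeterministicProcess discard terms that are perfectly collinear with the others. With one observation per month and a one-month period, every observation sits at the same position in its cycle, so each sine term is 0 and each cosine term is 1, which duplicates the constant. The sketch below is a simplified pure-Python illustration of that idea, not the statsmodels implementation. The helper fourier_terms is hypothetical, and months are treated as equally long, which the real library does not assume.

```python
import math

def fourier_terms(position, order):
    """Hypothetical helper: sin/cos pairs for harmonics 1..order.

    `position` is the fraction of the cycle elapsed, in [0, 1).
    """
    terms = []
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * position
        terms.append(round(math.sin(angle), 6))
        terms.append(round(math.cos(angle), 6))
    return terms

# Twelve monthly observations, each stamped at the start of its month.
months = range(1, 13)

# One-month cycle (the freq="M" case): every observation is at position 0.
monthly = [fourier_terms(0.0, order=2) for _ in months]

# One-year cycle (the freq="Y" case): position is the fraction of the year
# elapsed. Assumption: all months have equal length.
yearly = [fourier_terms((m - 1) / 12, order=2) for m in months]

# Count how many different rows each design produces.
monthly_distinct = {tuple(row) for row in monthly}
yearly_distinct = {tuple(row) for row in yearly}

print(len(monthly_distinct), monthly[0])
print(len(yearly_distinct))
```

In the monthly case every row is identical (sines at 0, cosines at 1), so the columns carry no information beyond the constant. In the yearly case each of the twelve months gets its own row, which is what lets the regression pick up the annual pattern.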

Answered By: Y00