How to take confidence interval of statsmodels.tsa.holtwinters-ExponentialSmoothing Models in python?
Question:
I did time series forecasting analysis with ExponentialSmoothing in python. I used statsmodels.tsa.holtwinters.
model = ExponentialSmoothing(df, seasonal='mul', seasonal_periods=12).fit()
pred = model.predict(start=df.index[0], end=122)
plt.plot(df_fc.index, df_fc, label='Train')
plt.plot(pred.index, pred, label='Holt-Winters')
plt.legend(loc='best')
I want to take confidence interval of the model result. But I couldn’t find any function about this in "statsmodels.tsa.holtwinters – ExponentialSmoothing". How to I do that?
Answers:
From this answer from a GitHub issue, it is clear that you should be using the new ETSModel
class, and not the old (but still present for compatibility) ExponentialSmoothing
.
ETSModel
includes more parameters and more functionality than ExponentialSmoothing
.
To calculate confidence intervals, I suggest you to use the simulate
method of ETSResults
:
from statsmodels.tsa.exponential_smoothing.ets import ETSModel
import pandas as pd
# Build model.
ets_model = ETSModel(
endog=y, # y should be a pd.Series
seasonal='mul',
seasonal_periods=12,
)
ets_result = ets_model.fit()
# Simulate predictions.
n_steps_prediction = y.shape[0]
n_repetitions = 500
df_simul = ets_result.simulate(
nsimulations=n_steps_prediction,
repetitions=n_repetitions,
anchor='start',
)
# Calculate confidence intervals.
upper_ci = df_simul.quantile(q=0.9, axis='columns')
lower_ci = df_simul.quantile(q=0.1, axis='columns')
Basically, calling the simulate
method you get a DataFrame with n_repetitions
columns, and with n_steps_prediction
steps (in this case, the same number of items in your training data-set y
).
Then, you calculate the confidence intervals with DataFrame quantile
method (remember the axis='columns'
option).
You could also calculate other statistics from the df_simul
.
I also checked the source code: simulate
is internally called by the forecast
method to predict steps in the future. So, you could also predict steps in the future and their confidence intervals with the same approach: just use anchor='end'
, so that the simulations will start from the last step in y
.
To be fair, there is also a more direct approach to calculate the confidence intervals: the get_prediction
method (which uses simulate
internally). But I do not really like its interface, it is not flexible enough for me, I did not find a way to specify the desired confidence intervals. The approach with the simulate
method is pretty easy to understand, and very flexible, in my opinion.
If you want further details on how this kind of simulations are performed, read this chapter from the excellent Forecasting: Principles and Practice online book.
Complementing the answer from @Enrico, we can use the get_prediction
in the following way:
ci = model.get_prediction(start = forecast_data.index[0], end = forecast_data.index[-1])
preds = ci.pred_int(alpha = .05) #confidence interval
limits = ci.predicted_mean
preds = pd.concat([limits, preds], axis = 1)
preds.columns = ['yhat', 'yhat_lower', 'yhat_upper']
preds
Implemented answer (by myself)…. @Enrico, we can use the get_prediction in the following way:
from statsmodels.tsa.exponential_smoothing.ets import ETSModel
#---sales:pd.series, time series data(index should be timedate format)
#---new advanced holt's winter ts model implementation
HWTES_Model = ETSModel(endog=sales, trend= 'mul', seasonal='mul', seasonal_periods=4).fit()
point_forecast = HWTES_Model.forecast(16)
#-------Confidence Interval forecast calculation start------------------
ci = HWTES_Model.get_prediction(start = point_forecast.index[0],
end = point_forecast.index[-1])
lower_conf_forecast = ci.pred_int(alpha=alpha_1).iloc[:,0]
upper_conf_forecast = ci.pred_int(alpha=alpha_1).iloc[:,1]
#-------Confidence Interval forecast calculation end-----------------
To complement the previous answers, I provide the function to plot the CI on top of the forecast.
def ets_forecast(model, h=8):
# Simulate predictions.
n_steps_prediction =h
n_repetitions = 1000
yhat = model.forecast(h)
df_simul = model.simulate(
nsimulations=n_steps_prediction,
repetitions=n_repetitions,
anchor='end',
)
# Calculate confidence intervals.
upper_ci = df_simul.quantile(q=0.975, axis='columns')
lower_ci = df_simul.quantile(q=0.025, axis='columns')
plt.plot(yhat.index, yhat.values)
plt.fill_between(yhat.index, (lower_ci), (upper_ci), color='blue', alpha=0.1)
return yhat
plt.plot(y)
ets_forecast(model2, h=8)
plt.show()
I did time series forecasting analysis with ExponentialSmoothing in python. I used statsmodels.tsa.holtwinters.
model = ExponentialSmoothing(df, seasonal='mul', seasonal_periods=12).fit()
pred = model.predict(start=df.index[0], end=122)
plt.plot(df_fc.index, df_fc, label='Train')
plt.plot(pred.index, pred, label='Holt-Winters')
plt.legend(loc='best')
I want to take confidence interval of the model result. But I couldn’t find any function about this in "statsmodels.tsa.holtwinters – ExponentialSmoothing". How to I do that?
From this answer from a GitHub issue, it is clear that you should be using the new ETSModel
class, and not the old (but still present for compatibility) ExponentialSmoothing
.
ETSModel
includes more parameters and more functionality than ExponentialSmoothing
.
To calculate confidence intervals, I suggest you to use the simulate
method of ETSResults
:
from statsmodels.tsa.exponential_smoothing.ets import ETSModel
import pandas as pd
# Build model.
ets_model = ETSModel(
endog=y, # y should be a pd.Series
seasonal='mul',
seasonal_periods=12,
)
ets_result = ets_model.fit()
# Simulate predictions.
n_steps_prediction = y.shape[0]
n_repetitions = 500
df_simul = ets_result.simulate(
nsimulations=n_steps_prediction,
repetitions=n_repetitions,
anchor='start',
)
# Calculate confidence intervals.
upper_ci = df_simul.quantile(q=0.9, axis='columns')
lower_ci = df_simul.quantile(q=0.1, axis='columns')
Basically, calling the simulate
method you get a DataFrame with n_repetitions
columns, and with n_steps_prediction
steps (in this case, the same number of items in your training data-set y
).
Then, you calculate the confidence intervals with DataFrame quantile
method (remember the axis='columns'
option).
You could also calculate other statistics from the df_simul
.
I also checked the source code: simulate
is internally called by the forecast
method to predict steps in the future. So, you could also predict steps in the future and their confidence intervals with the same approach: just use anchor='end'
, so that the simulations will start from the last step in y
.
To be fair, there is also a more direct approach to calculate the confidence intervals: the get_prediction
method (which uses simulate
internally). But I do not really like its interface, it is not flexible enough for me, I did not find a way to specify the desired confidence intervals. The approach with the simulate
method is pretty easy to understand, and very flexible, in my opinion.
If you want further details on how this kind of simulations are performed, read this chapter from the excellent Forecasting: Principles and Practice online book.
Complementing the answer from @Enrico, we can use the get_prediction
in the following way:
ci = model.get_prediction(start = forecast_data.index[0], end = forecast_data.index[-1])
preds = ci.pred_int(alpha = .05) #confidence interval
limits = ci.predicted_mean
preds = pd.concat([limits, preds], axis = 1)
preds.columns = ['yhat', 'yhat_lower', 'yhat_upper']
preds
Implemented answer (by myself)…. @Enrico, we can use the get_prediction in the following way:
from statsmodels.tsa.exponential_smoothing.ets import ETSModel
#---sales:pd.series, time series data(index should be timedate format)
#---new advanced holt's winter ts model implementation
HWTES_Model = ETSModel(endog=sales, trend= 'mul', seasonal='mul', seasonal_periods=4).fit()
point_forecast = HWTES_Model.forecast(16)
#-------Confidence Interval forecast calculation start------------------
ci = HWTES_Model.get_prediction(start = point_forecast.index[0],
end = point_forecast.index[-1])
lower_conf_forecast = ci.pred_int(alpha=alpha_1).iloc[:,0]
upper_conf_forecast = ci.pred_int(alpha=alpha_1).iloc[:,1]
#-------Confidence Interval forecast calculation end-----------------
To complement the previous answers, I provide the function to plot the CI on top of the forecast.
def ets_forecast(model, h=8):
# Simulate predictions.
n_steps_prediction =h
n_repetitions = 1000
yhat = model.forecast(h)
df_simul = model.simulate(
nsimulations=n_steps_prediction,
repetitions=n_repetitions,
anchor='end',
)
# Calculate confidence intervals.
upper_ci = df_simul.quantile(q=0.975, axis='columns')
lower_ci = df_simul.quantile(q=0.025, axis='columns')
plt.plot(yhat.index, yhat.values)
plt.fill_between(yhat.index, (lower_ci), (upper_ci), color='blue', alpha=0.1)
return yhat
plt.plot(y)
ets_forecast(model2, h=8)
plt.show()