Introduction
Machine learning interpretability, also known as explainability, refers to the ability to understand, interpret, and explain the decisions or predictions made by machine learning models in a human-understandable way. It aims to shed light on how a model arrives at a particular result or decision.
Many modern machine learning models, such as ensemble methods, are complex enough that they function as black boxes, making it difficult to understand why a particular prediction was made. Explainability techniques aim to demystify these models, providing insight into their inner workings and helping to build trust, improve transparency, and meet regulatory requirements in various domains. Improving model explainability not only helps to understand model behavior, but also helps to identify biases, improve model performance, and enable stakeholders to make more informed decisions based on machine learning insights.
The skforecast library is compatible with some of the most widely used interpretability methods: SHAP values, partial dependence plots, and model-specific feature importances.
Libraries
Libraries used in this document.
# Data manipulation
# ==============================================================================
import pandas as pd
import numpy as np
from skforecast.datasets import fetch_dataset
# Plotting
# ==============================================================================
import matplotlib.pyplot as plt
import shap
from skforecast.plot import set_dark_theme
# Modeling and forecasting
# ==============================================================================
import sklearn
import lightgbm
import skforecast
from sklearn.inspection import PartialDependenceDisplay
from lightgbm import LGBMRegressor
from skforecast.recursive import ForecasterRecursive
from skforecast.preprocessing import RollingFeatures
from skforecast.model_selection import backtesting_forecaster, TimeSeriesFold
color = '\033[1m\033[38;5;208m'
print(f"{color}Version skforecast: {skforecast.__version__}")
print(f"{color}Version scikit-learn: {sklearn.__version__}")
print(f"{color}Version lightgbm: {lightgbm.__version__}")
print(f"{color}Version shap: {shap.__version__}")
Version skforecast: 0.16.0
Version scikit-learn: 1.6.1
Version lightgbm: 4.6.0
Version shap: 0.47.2
Data
# Download data
# ==============================================================================
data = fetch_dataset(name="vic_electricity")
data.head(3)
vic_electricity
---------------
Half-hourly electricity demand for Victoria, Australia
O'Hara-Wild M, Hyndman R, Wang E, Godahewa R (2022). tsibbledata: Diverse Datasets for 'tsibble'. https://tsibbledata.tidyverts.org/, https://github.com/tidyverts/tsibbledata/. https://tsibbledata.tidyverts.org/reference/vic_elec.html
Shape of the dataset: (52608, 4)
Time | Demand | Temperature | Date | Holiday |
---|---|---|---|---|
2011-12-31 13:00:00 | 4382.825174 | 21.40 | 2012-01-01 | True |
2011-12-31 13:30:00 | 4263.365526 | 21.05 | 2012-01-01 | True |
2011-12-31 14:00:00 | 4048.966046 | 20.70 | 2012-01-01 | True |
# Aggregation to daily frequency
# ==============================================================================
data = data.resample('D').agg({'Demand': 'sum', 'Temperature': 'mean'})
data.head(3)
Time | Demand | Temperature |
---|---|---|
2011-12-31 | 82531.745918 | 21.047727 |
2012-01-01 | 227778.257304 | 26.578125 |
2012-01-02 | 275490.988882 | 31.751042 |
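The first daily total is far lower than the ones that follow because the series starts at 13:00 on 2011-12-31, so that day is only partially covered. The sketch below checks this with the standard library only. It assumes a regular, gap-free half-hourly index (which is consistent with the 52,608 rows reported, but is still an assumption), and the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Facts taken from the output above; a regular, gap-free index is an assumption.
first_timestamp = datetime(2011, 12, 31, 13, 0)
n_observations = 52608
step = timedelta(minutes=30)

last_timestamp = first_timestamp + (n_observations - 1) * step

# Half-hourly readings that fall on the first and on the last calendar day
next_midnight = datetime(2012, 1, 1)
slots_first_day = (next_midnight - first_timestamp) // step
last_midnight = datetime.combine(last_timestamp.date(), datetime.min.time())
slots_last_day = (last_timestamp - last_midnight) // step + 1

n_days = (last_timestamp.date() - first_timestamp.date()).days + 1

print(last_timestamp)                   # 2014-12-31 12:30:00
print(slots_first_day, slots_last_day)  # 22 26 (a full day has 48)
print(n_days)                           # 1097
```

Under that assumption, the first and last days hold 22 and 26 readings instead of 48, so their summed demand isn't directly comparable with that of complete days.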
# Create calendar variables
# ==============================================================================
data['day_of_week'] = data.index.dayofweek
data['month'] = data.index.month
data.head(3)
Time | Demand | Temperature | day_of_week | month |
---|---|---|---|---|
2011-12-31 | 82531.745918 | 21.047727 | 5 | 12 |
2012-01-01 | 227778.257304 | 26.578125 | 6 | 1 |
2012-01-02 | 275490.988882 | 31.751042 | 0 | 1 |
# Split train-test
# ==============================================================================
end_train = '2014-12-01 23:59:00'
data_train = data.loc[: end_train, :]
data_test = data.loc[end_train:, :]
print(f"Dates train : {data_train.index.min()} --- {data_train.index.max()} (n={len(data_train)})")
print(f"Dates test : {data_test.index.min()} --- {data_test.index.max()} (n={len(data_test)})")
Dates train : 2011-12-31 00:00:00 --- 2014-12-01 00:00:00 (n=1067)
Dates test : 2014-12-02 00:00:00 --- 2014-12-31 00:00:00 (n=30)
Forecasting model
A forecasting model is created to predict the energy demand using the past 7 values (last week), the rolling mean of the previous 24 values, and three exogenous variables: temperature, day of the week, and month.
# Create a recursive multi-step forecaster (ForecasterRecursive)
# ==============================================================================
window_features = RollingFeatures(stats=['mean'], window_sizes=24)
exog_features = ['Temperature', 'day_of_week', 'month']
forecaster = ForecasterRecursive(
    regressor       = LGBMRegressor(random_state=123, verbose=-1),
    lags            = 7,
    window_features = window_features
)
forecaster.fit(
    y    = data_train['Demand'],
    exog = data_train[exog_features],
)
forecaster
ForecasterRecursive
General Information
- Regressor: LGBMRegressor
- Lags: [1 2 3 4 5 6 7]
- Window features: ['roll_mean_24']
- Window size: 24
- Series name: Demand
- Exogenous included: True
- Weight function included: False
- Differentiation order: None
- Creation date: 2025-05-14 22:23:19
- Last fit date: 2025-05-14 22:23:19
- Skforecast version: 0.16.0
- Python version: 3.12.9
- Forecaster id: None
Exogenous Variables
- Temperature, day_of_week, month
Data Transformations
- Transformer for y: None
- Transformer for exog: None
Training Information
- Training range: [Timestamp('2011-12-31 00:00:00'), Timestamp('2014-12-01 00:00:00')]
- Training index type: DatetimeIndex
- Training index frequency: D
Regressor Parameters
- {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 123, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}
Fit Kwargs
- {}
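The window size of 24 reported in the summary appears to follow from the largest look-back among the predictors: 7 lags and a 24-value rolling mean. As a rough sketch of the consequence, the first 24 training observations can't produce a complete row of predictors. The arithmetic below uses only the standard library; the variable names are illustrative and this isn't skforecast's internal code.

```python
from datetime import date, timedelta

# Values taken from the text: 7 lags and a rolling mean over 24 observations.
max_lag = 7
rolling_window = 24

# The forecaster needs as many past observations as its largest look-back.
window_size = max(max_lag, rolling_window)

train_start = date(2011, 12, 31)  # first date of the training set
n_train = 1067                    # daily observations in the training set

# First date for which every predictor can be computed, and rows left to train on
first_usable_date = train_start + timedelta(days=window_size)
n_rows_train_matrix = n_train - window_size

print(window_size, first_usable_date, n_rows_train_matrix)
```

With these numbers, the first usable date is 2012-01-24 and 1,043 of the 1,067 training observations remain as rows for the regressor.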
Model-specific feature importances
Feature importance in machine learning determines the relevance or importance of each feature (or variable) in a model's prediction. In other words, it measures how much each feature contributes to the model's output.
Feature importance can be used for several purposes, such as identifying the most relevant features for a given prediction, understanding the behavior of a model, and selecting the best set of features for a given task. It can also help to identify potential biases or errors in the data used to train the model. It is important to note that feature importance is not a definitive measure of causality. Just because a feature is identified as important does not necessarily mean that it caused the outcome. Other factors, such as confounding variables, may also be at play.
The method used to calculate feature importance may vary depending on the type of machine learning model used. Different models can have different assumptions and characteristics that affect the importance calculation. For example, tree-based models, such as Random Forest and Gradient Boosting, typically use methods that measure the reduction in impurity or the effect of permuting a feature. Linear regression models typically use coefficients, whose sign and magnitude reflect the direction and strength of the relationship between the predictor and the target variable (magnitudes are only comparable when the predictors are on the same scale). For the LGBMRegressor used here, importance_type is 'split' (see the regressor parameters above), so the reported importance is the number of times each feature is used to split a node.
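To illustrate the permutation idea mentioned above, here is a minimal, library-free sketch: shuffle one feature at a time and measure how much the error grows. The toy model, the data, and the function names are all hypothetical, and real implementations (for example in scikit-learn) are more careful about scoring and repetition.

```python
import random

def toy_model(row):
    # Hypothetical "fitted" model: strong effect of x0, weak effect of x1, none of x2.
    return 3.0 * row[0] + 0.5 * row[1]

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(model, X, y, n_repeats=20, seed=123):
    """Average increase in MAE after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [model(r) for r in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)
            # Rebuild the data with only column j shuffled
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            error = mean_absolute_error(y, [model(r) for r in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

data_rng = random.Random(0)
X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(r) for r in X]

importances = permutation_importance(toy_model, X, y)
print([round(v, 3) for v in importances])
```

The feature the model ignores gets an importance of exactly zero, while the other two are ranked by how much the prediction error grows when they are scrambled.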
The importance of the predictors included in a forecaster can be obtained using the method get_feature_importances(). This method accesses the coef_ and feature_importances_ attributes of the internal regressor.
⚠ Warning
The get_feature_importances() method will only return values if the forecaster's regressor has either the coef_ or feature_importances_ attribute, which is the naming convention followed by scikit-learn estimators.
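The logic behind this warning can be pictured with a small stand-alone sketch. The helper extract_importances and the stand-in objects are hypothetical and aren't skforecast's actual implementation; they only show the attribute check described above.

```python
from types import SimpleNamespace

def extract_importances(regressor, feature_names):
    """Hypothetical helper: return (feature, value) pairs sorted by magnitude,
    or None when the regressor exposes neither attribute."""
    if hasattr(regressor, "feature_importances_"):
        values = regressor.feature_importances_
    elif hasattr(regressor, "coef_"):
        values = regressor.coef_
    else:
        return None
    pairs = list(zip(feature_names, values))
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)

names = ["lag_1", "lag_2", "Temperature"]

# Stand-ins for fitted regressors (illustrative values)
tree_like = SimpleNamespace(feature_importances_=[426, 291, 568])
linear_like = SimpleNamespace(coef_=[0.8, -1.5, 0.1])
opaque = SimpleNamespace()  # exposes neither attribute

print(extract_importances(tree_like, names))
print(extract_importances(linear_like, names))
print(extract_importances(opaque, names))
```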
# Extract feature importance
# ==============================================================================
importance = forecaster.get_feature_importances()
importance
index | feature | importance |
---|---|---|
8 | Temperature | 568 |
0 | lag_1 | 426 |
1 | lag_2 | 291 |
6 | lag_7 | 248 |
4 | lag_5 | 243 |
2 | lag_3 | 236 |
7 | roll_mean_24 | 227 |
10 | month | 224 |
5 | lag_6 | 200 |
3 | lag_4 | 177 |
9 | day_of_week | 160 |
SHAP explanations for skforecast models
SHAP (SHapley Additive exPlanations) values are a widely adopted method for explaining machine learning models. They provide both visual and quantitative insights into how features and their values impact the model. SHAP values serve two primary purposes:
Global Interpretability: SHAP values help identify how each feature influenced the model during training. By averaging SHAP values across the dataset, one can rank features by their overall importance and gain insight into the model’s decision-making process.
Local Interpretability: SHAP values also explain individual predictions by indicating how much each feature contributed to a specific output. This enables a breakdown of single predictions to understand the role each feature played in the outcome.
SHAP value explanations can be generated for skforecast models using two essential components:
- The internal regressor of the forecaster, accessible via forecaster.regressor.
- The internal matrices used for fitting, backtesting, and predicting with the forecaster. These matrices are accessible through the methods create_train_X_y() and create_predict_X(), and by setting the argument return_predictors=True in the backtesting_forecaster() function.
By leveraging these elements, users can produce clear and interpretable explanations for their forecasting models. These explanations can be used to assess model reliability, identify the most influential features, and better understand the relationships between input variables and the target variable.
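To make the notion of a SHAP value more concrete, the sketch below computes exact Shapley values by brute force for a hypothetical three-feature model. It isn't how the shap library works internally (its tree explainer relies on a much faster, tree-specific algorithm); the model, the background data, and the function names are assumptions made for illustration. Features outside a coalition are replaced by values from the background rows and the outputs are averaged.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Hypothetical model with an interaction between features 1 and 2.
    return 2.0 * x[0] + x[1] * x[2]

# Small made-up background data set used to "remove" features
background = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
]

def coalition_value(x, subset):
    """Average output when features in `subset` are fixed to x's values."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(x):
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = coalition_value(x, set(subset) | {i})
                without_i = coalition_value(x, set(subset))
                phi[i] += weight * (with_i - without_i)
    return phi

x = [3.0, 2.0, 1.0]
phi = shapley_values(x)
base_value = coalition_value(x, set())  # average prediction over the background
prediction = model(x)

print(phi, base_value, prediction)
```

The contributions add up to the gap between the prediction and the base value, which is the additivity property that the SHAP plots in this document rely on.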
SHAP feature importance in the overall model
By aggregating the SHAP values across the data set used to train the model, it is possible to estimate the contribution of each feature to the model. The larger the mean absolute SHAP value of a feature, the more important that feature is for the model. The sign of each individual SHAP value indicates whether the feature pushed that particular prediction up or down.
First, the training matrices used to fit the model are created with the method create_train_X_y().
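When ranking features globally, it is the mean of the absolute SHAP values that matters, since positive and negative contributions can cancel out in a plain average. The following sketch uses a small matrix of made-up SHAP values (rows are observations, columns are features) to show the difference; the numbers aren't taken from the model in this document.

```python
# Made-up SHAP values: one row per observation, one column per feature
feature_names = ["Temperature", "lag_1", "month"]
shap_matrix = [
    [ 4.0, -1.0,  0.5],
    [-4.0, -1.5,  0.2],
    [ 5.0, -0.5, -0.1],
    [-5.0, -1.0,  0.4],
]
n_rows = len(shap_matrix)

# Plain average: direction of the typical effect, but opposite signs cancel
mean_shap = {
    name: sum(row[j] for row in shap_matrix) / n_rows
    for j, name in enumerate(feature_names)
}

# Mean absolute value: overall magnitude, used to rank features
mean_abs_shap = {
    name: sum(abs(row[j]) for row in shap_matrix) / n_rows
    for j, name in enumerate(feature_names)
}

ranking = sorted(mean_abs_shap, key=mean_abs_shap.get, reverse=True)
print(mean_shap)
print(mean_abs_shap)
print(ranking)
```

In this toy case the first feature has a plain average of zero yet is by far the most influential one, which is why bar-style summary plots are built on mean absolute values.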
# Training matrices used by the forecaster to fit the internal regressor
# ==============================================================================
X_train, y_train = forecaster.create_train_X_y(
    y    = data_train['Demand'],
    exog = data_train[exog_features],
)
display(X_train.head(3)) # Features
display(y_train.head(3)) # Target
Time | lag_1 | lag_2 | lag_3 | lag_4 | lag_5 | lag_6 | lag_7 | roll_mean_24 | Temperature | day_of_week | month |
---|---|---|---|---|---|---|---|---|---|---|---|
2012-01-24 | 280188.298774 | 239810.374218 | 207949.859910 | 225035.325476 | 240187.677944 | 247722.494256 | 292458.685446 | 222658.202570 | 26.611458 | 1.0 | 1.0 |
2012-01-25 | 287474.816646 | 280188.298774 | 239810.374218 | 207949.859910 | 225035.325476 | 240187.677944 | 247722.494256 | 231197.497184 | 19.759375 | 2.0 | 1.0 |
2012-01-26 | 239083.684380 | 287474.816646 | 280188.298774 | 239810.374218 | 207949.859910 | 225035.325476 | 240187.677944 | 231668.556646 | 20.038542 | 3.0 | 1.0 |
Time
2012-01-24    287474.816646
2012-01-25    239083.684380
2012-01-26    214239.304588
Freq: D, Name: y, dtype: float64
Then, the SHAP values are calculated using the shap library. The shap_values() method of the explainer is used to calculate the SHAP values for the training data. If the data set is large, it is recommended to use only a random sample.
# Create SHAP explainer (for tree-based models)
# ==============================================================================
explainer = shap.TreeExplainer(forecaster.regressor)
# Sample 50% of the data to speed up the calculation
rng = np.random.default_rng(seed=785412)
sample = rng.choice(X_train.index, size=int(len(X_train)*0.5), replace=False)
X_train_sample = X_train.loc[sample, :]
shap_values = explainer.shap_values(X_train_sample)
✎ Note
The shap library has several explainers, each designed for a different type of model. The shap.TreeExplainer explainer is used for tree-based models, such as the LGBMRegressor used in this example. For more information, see the SHAP documentation.
Once the SHAP values are calculated, several plots can be generated to visualize the results.
SHAP Summary Plot
The SHAP summary plot typically displays the feature importance or contribution of each feature to the model's output across multiple data points. It shows how much each feature contributes to pushing the model's prediction away from a base value (often the model's average prediction). By examining a SHAP summary plot, one can gain insights into which features have the most significant impact on predictions, whether they positively or negatively influence the outcome, and how different feature values contribute to specific predictions.
# Shap summary plot (top 10)
# ==============================================================================
shap.initjs()
shap.summary_plot(shap_values, X_train_sample, max_display=10, show=False)
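fig, ax = plt.gcf(), plt.gca()
ax.set_title("SHAP Summary plot")
ax.tick_params(labelsize=8)
fig.set_size_inches(6, 3)
plt.show()
# Bar plot of mean absolute SHAP values (same sample used to compute shap_values)
shap.summary_plot(shap_values, X_train_sample, plot_type="bar", plot_size=(6, 3))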
SHAP Dependence Plots
SHAP dependence plots show how the value of a single feature affects the model's predictions across its range, while also accounting for interactions with other features. Each point is one observation: the x-axis shows the feature value, the y-axis shows its SHAP value, and the points are colored by a second feature that may interact with it.
# Dependence plot for Temperature
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
shap.dependence_plot("Temperature", shap_values, X_train_sample, ax=ax)
SHAP Explanations for Individual Predictions
SHAP values not only allow for interpreting the general behavior of the model (Global Interpretability) but also serve as a powerful tool for analyzing individual predictions (Local Interpretability). This is especially useful when trying to understand why a model made a specific prediction for a given instance.
To carry out this analysis, it is necessary to access the predictor values (lags, window features, and exogenous variables) at the time of the prediction. This can be achieved by using the create_predict_X method or by setting return_predictors=True in the backtesting_forecaster function.
SHAP values of predict() output
Suppose the forecaster is employed to predict the next 30 values of the series, and a specific prediction corresponding to the date '2014-12-28' requires explanation.
# Forecasting next 30 days
# ==============================================================================
predictions = forecaster.predict(steps=30, exog=data_test[exog_features])
set_dark_theme()
fig, ax = plt.subplots(figsize=(6, 2.5))
data_test['Demand'].plot(ax=ax, label='Test')
predictions.plot(ax=ax, label='Predictions', linestyle='--')
ax.set_xlabel(None)
ax.legend();
The method create_predict_X is used to create the input matrix used internally by the forecaster's predict method. This matrix is then used to generate SHAP values for the forecasted values.
# Create input matrix used to forecast the next 30 steps
# ==============================================================================
X_predict = forecaster.create_predict_X(steps=30, exog=data_test[exog_features])
X_predict.head(3)
index | lag_1 | lag_2 | lag_3 | lag_4 | lag_5 | lag_6 | lag_7 | roll_mean_24 | Temperature | day_of_week | month |
---|---|---|---|---|---|---|---|---|---|---|---|
2014-12-02 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 218321.456402 | 214318.765210 | 211369.709659 | 19.833333 | 1.0 | 12.0 |
2014-12-03 | 230878.900870 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 218321.456402 | 212777.981610 | 19.616667 | 2.0 | 12.0 |
2014-12-04 | 230782.656189 | 230878.900870 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 214485.198829 | 21.702083 | 3.0 | 12.0 |
# SHAP values for the predictions
# ==============================================================================
shap_values = explainer.shap_values(X_predict)
# Waterfall plot for a single prediction
# ==============================================================================
predicted_date = '2014-12-28'
iloc_predicted_date = X_predict.index.get_loc(predicted_date)
shap_values_single = explainer(X_predict)
shap.plots.waterfall(shap_values_single[iloc_predicted_date], show=False)
fig = plt.gcf()
fig.set_size_inches(6, 3.5)
ax = fig.axes[0]
ax.tick_params(labelsize=10)
plt.show()
The waterfall plot illustrates how different features pushed the model’s output higher (shown in red) or lower (shown in blue), relative to the average model prediction.
- lag_1 had the largest negative impact, reducing the prediction by over 22,000 units.
- Temperature contributed positively, increasing the prediction by around 7,685 units.
- Other features like day_of_week, month, and various lag values also influenced the prediction to a lesser degree.
The model prediction (f(x)) was 200,849.86, while the expected value (E[f(x)]), which is the explainer's base value, is 224,358.59. This means the specific inputs for this prediction led the model to forecast a value lower than average, largely due to the strong negative impact of lag_1.
The same insights can be obtained using the shap.force_plot function.
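Because SHAP values are additive, the contributions of all features must sum to the difference between the prediction and the expected value. A quick check with the two figures reported above (only these two numbers come from the text; the variable names are illustrative):

```python
expected_value = 224_358.59  # E[f(x)] reported in the text
prediction = 200_849.86      # f(x) reported in the text

# Sum that the individual SHAP contributions must add up to
total_contribution = round(prediction - expected_value, 2)

print(total_contribution)
```

The net effect of all features is therefore a reduction of roughly 23,500 units, which the individual bars of the waterfall plot add up to.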
# Forceplot for a single prediction
# ==============================================================================
shap.force_plot(
    base_value  = explainer.expected_value,
    shap_values = shap_values_single.values[iloc_predicted_date],
    features    = X_predict.iloc[iloc_predicted_date, :]
)
# Force plot for the 30 predictions
# ==============================================================================
shap.force_plot(
    base_value  = explainer.expected_value,
    shap_values = shap_values,
    features    = X_predict
)
SHAP values of backtesting_forecaster() output
The analysis of individual predictions using SHAP values can also be applied to predictions made in a backtesting process. To do so, set return_predictors=True in the backtesting_forecaster function. It then returns a DataFrame with the predicted value ('pred'), the partition it belongs to ('fold'), and the values of the lags, window features, and exogenous variables used to make each prediction.
In this scenario, a backtesting process is employed to train the model using data up to '2014-12-01 23:59:00'. The model then generates predictions in folds of 24 steps. SHAP values are subsequently computed for the forecast corresponding to the date '2014-12-16'.
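To see where the explained date falls, the sketch below lays out the folds for the 30 test days using folds of 24 steps. It uses only the standard library, assumes the last, shorter fold is kept, and its variable names are illustrative rather than part of skforecast.

```python
from datetime import date, timedelta

first_test_date = date(2014, 12, 2)  # first day after the training period
n_test = 30                          # test observations
steps = 24                           # fold length used in the backtest

# Consecutive folds covering the test period (the last one may be shorter)
folds = []
for fold_id, offset in enumerate(range(0, n_test, steps)):
    size = min(steps, n_test - offset)
    start = first_test_date + timedelta(days=offset)
    end = start + timedelta(days=size - 1)
    folds.append({"fold": fold_id, "start": start, "end": end, "size": size})

target = date(2014, 12, 16)
target_fold = next(f for f in folds if f["start"] <= target <= f["end"])
step_in_fold = (target - target_fold["start"]).days + 1

for f in folds:
    print(f)
print(target_fold["fold"], step_in_fold)
```

The date 2014-12-16 is the 15th step of the first fold. Since the forecaster is recursive and uses 7 lags, every lag fed to the model at that step is itself an earlier prediction rather than an observed value, which is worth keeping in mind when reading its SHAP values.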
# Backtesting returning the predictors
# ==============================================================================
cv = TimeSeriesFold(steps=24, initial_train_size=len(data.loc[:'2014-12-01 23:59:00']))
_, predictions = backtesting_forecaster(
    forecaster        = forecaster,
    y                 = data['Demand'],
    exog              = data[exog_features],
    cv                = cv,
    metric            = 'mean_absolute_error',
    return_predictors = True,
)
predictions.head(3)
index | pred | fold | lag_1 | lag_2 | lag_3 | lag_4 | lag_5 | lag_6 | lag_7 | roll_mean_24 | Temperature | day_of_week | month |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2014-12-02 | 230878.900870 | 0 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 218321.456402 | 214318.765210 | 211369.709659 | 19.833333 | 1.0 | 12.0 |
2014-12-03 | 230782.656189 | 0 | 230878.900870 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 218321.456402 | 212777.981610 | 19.616667 | 2.0 | 12.0 |
2014-12-04 | 237992.220195 | 0 | 230782.656189 | 230878.900870 | 237812.592388 | 234970.336660 | 189653.758108 | 202017.012448 | 214602.854760 | 214485.198829 | 21.702083 | 3.0 | 12.0 |
# Waterfall for a single prediction generated during backtesting
# ==============================================================================
predictions = predictions.astype(data[exog_features].dtypes) # Ensure that the types are the same
iloc_predicted_date = predictions.index.get_loc('2014-12-16')
shap_values_single = explainer(predictions.iloc[:, 2:])
shap.plots.waterfall(shap_values_single[iloc_predicted_date], show=False)
fig = plt.gcf()
fig.set_size_inches(6, 3.5)
ax = fig.axes[0]
ax.tick_params(labelsize=8)
plt.show()
Scikit-learn partial dependence plots
Partial dependence plots (PDPs) are a useful tool for understanding the relationship between a feature and the target outcome in a machine learning model. In scikit-learn, partial dependence plots can be created with PartialDependenceDisplay.from_estimator (the older plot_partial_dependence function was removed in scikit-learn 1.2). It visualizes the effect of one or two features on the predicted outcome, while marginalizing the effect of all other features.
The resulting plots show how changes in the selected feature(s) affect the predicted outcome while holding other features constant on average. Remember that these plots should be interpreted in the context of your model and data. They provide insight into the relationship between specific features and the model's predictions.
A more detailed description of partial dependence plots can be found in the scikit-learn User Guide.
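The marginalization behind a partial dependence curve can be sketched by hand: fix the feature of interest at each grid value for every row, predict, and average. The toy model, data, and function names below are hypothetical and only illustrate the idea; scikit-learn's implementation offers more options, such as the individual curves drawn with kind='both'.

```python
def model(row):
    # Hypothetical fitted model: U-shaped effect of temperature, linear effect of lag_1.
    temperature, lag_1 = row
    return 0.5 * (temperature - 18.0) ** 2 + 0.9 * lag_1

# Made-up observations: [temperature, lag_1]
X = [[12.0, 200.0], [18.0, 220.0], [25.0, 240.0], [31.0, 260.0]]

def partial_dependence(model, X, feature_idx, grid):
    """Average prediction over X with one feature forced to each grid value."""
    curve = []
    for value in grid:
        predictions = []
        for row in X:
            modified = list(row)
            modified[feature_idx] = value  # overwrite only the studied feature
            predictions.append(model(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve

grid = [10.0, 18.0, 26.0]
pd_curve = partial_dependence(model, X, feature_idx=0, grid=grid)
print(list(zip(grid, pd_curve)))
```

The resulting curve is lowest at 18 degrees and symmetric around it, recovering the U-shaped temperature effect built into the toy model while the lag effect is averaged out.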
# Scikit-learn partial dependence plots
# ==============================================================================
fig, ax = plt.subplots(figsize=(8, 3))
pd_plots = PartialDependenceDisplay.from_estimator(
    estimator = forecaster.regressor,
    X         = X_train,
    features  = ["Temperature", "lag_1"],
    kind      = 'both',
    ax        = ax,
)
ax.set_title("Partial Dependence Plot")
fig.tight_layout()
plt.show()
Session information
import session_info
session_info.show(html=False)
-----
lightgbm            4.6.0
matplotlib          3.10.1
numpy               2.2.5
pandas              2.2.3
session_info        v1.0.1
shap                0.47.2
skforecast          0.16.0
sklearn             1.6.1
-----
IPython             9.1.0
jupyter_client      8.6.3
jupyter_core        5.7.2
notebook            6.5.7
-----
Python 3.12.9 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 18:56:27) [GCC 11.2.0]
Linux-6.11.0-25-generic-x86_64-with-glibc2.39
-----
Session information updated at 2025-05-14 22:23
Citation
How to cite this document
If you use this document or any part of it, please acknowledge the source, thank you!
Interpretable forecasting models by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under an Attribution-NonCommercial-ShareAlike 4.0 International license at https://www.cienciadedatos.net/documentos/py57-interpretable-forecasting-models.html
How to cite skforecast
If you use skforecast for a publication, we would appreciate it if you cite the published software.
Zenodo:
Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2025). skforecast (v0.16.0). Zenodo. https://doi.org/10.5281/zenodo.8382788
APA:
Amat Rodrigo, J., & Escobar Ortiz, J. (2025). skforecast (Version 0.16.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788
BibTeX:
@software{skforecast,
  author  = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier},
  title   = {skforecast},
  version = {0.16.0},
  month   = {05},
  year    = {2025},
  license = {BSD-3-Clause},
  url     = {https://skforecast.org/},
  doi     = {10.5281/zenodo.8382788}
}
This work by Joaquín Amat Rodrigo and Javier Escobar Ortiz is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.
Allowed:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial: You may not use the material for commercial purposes.
- ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.