In univariate time series forecasting, a single time series is modeled as a linear or nonlinear combination of its lags. That is, the past values of the series are used to forecast its future. In multi-series forecasting, two or more time series are modeled together using a single model. Two strategies can be distinguished:
Independent multi-series forecasting
In independent multi-series forecasting a single model is trained for all time series, but each time series remains independent of the others, meaning that past values of one series are not used as predictors of the other series. However, modeling them together is useful because the series may follow similar intrinsic patterns in their past and future values. For instance, the sales of products A and B in the same store may not be directly related, but they follow the same dynamics: those of the store.
To predict the next n steps, the strategy of recursive multi-step forecasting is applied, with the only difference being that the series name for which to estimate the predictions needs to be indicated.
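As a minimal sketch (synthetic data and illustrative series names, unrelated to the dataset used later in this document), the series to forecast is selected with the levels argument of predict():
# Example: selecting the series to predict with `levels` (synthetic data)
# ======================================================================================
import pandas as pd
from sklearn.linear_model import Ridge
from skforecast.ForecasterAutoregMultiSeries import ForecasterAutoregMultiSeries
series = pd.DataFrame(
           data  = {'series_a': range(100), 'series_b': range(100, 200)},
           index = pd.date_range('2020-01-01', periods=100, freq='D')
         )
# A single model is trained on both series...
forecaster = ForecasterAutoregMultiSeries(regressor=Ridge(), lags=7)
forecaster.fit(series=series)
# ...but predictions are requested per series through `levels`
forecaster.predict(steps=7, levels=['series_a'])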
Dependent multi-series forecasting (Multivariate forecasting)
In dependent multi-series forecasting (multivariate time series), all series are modeled together in a single model, considering that each time series depends not only on its past values but also on the past values of the other series. The forecaster is expected not only to learn the information of each series separately but also to relate them. An example is the measurements made by all the sensors (flow, temperature, pressure...) installed on an industrial machine such as a compressor.
🖉 Note
The ForecasterAutoregMultiSeries and ForecasterAutoregMultiSeriesCustom classes cover the use case of independent multi-series forecasting (API Reference). The ForecasterAutoregMultiVariate class covers the use case of dependent multi-series forecasting, i.e., multivariate forecasting (API Reference).
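A minimal sketch of the dependent strategy (again synthetic data and illustrative names): the target series is indicated with level and the maximum forecast horizon with steps.
# Example: dependent multi-series forecaster (synthetic data)
# ======================================================================================
import pandas as pd
from sklearn.linear_model import Ridge
from skforecast.ForecasterAutoregMultiVariate import ForecasterAutoregMultiVariate
series = pd.DataFrame(
           data  = {'temperature': range(100), 'pressure': range(100, 200)},
           index = pd.date_range('2020-01-01', periods=100, freq='D')
         )
# Lags of all the series are used as predictors of the target series (`level`)
forecaster = ForecasterAutoregMultiVariate(
                 regressor = Ridge(),
                 level     = 'temperature',
                 lags      = 7,
                 steps     = 3
             )
forecaster.fit(series=series)
forecaster.predict(steps=3)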
Multi-series forecasts do not always outperform single-series forecasts. Which one works best depends largely on the characteristics of the use case to which they are applied. However, the following heuristic is worth keeping in mind:
Advantages of multi-series
It is easier to maintain and monitor a single model than several.
Since all time series are combined during training, the model has a higher learning capacity even if the series are short.
By combining multiple time series, the model can learn more generalizable patterns.
Disadvantages of multi-series
If the series do not follow the same internal dynamics, the model may learn a pattern that does not represent any of them.
The series may mask each other, so the model may not predict all of them with the same performance.
It is more computationally demanding (time and resources) to train and backtest a big model than several small ones.
The objective of this study is to compare the forecasting performance of a single multi-series model against that of a separate model for each series.
Data has been obtained from the Store Item Demand Forecasting Challenge (Kaggle). This dataset contains 913,000 daily sales records from 2013-01-01 to 2017-12-31 for 50 products (SKUs) in 10 stores. The goal is to predict the sales of the 50 items in one store over the next 7 days, using the 5 years of available history.
# Libraries
# ======================================================================================
import numpy as np
import pandas as pd
from tqdm import tqdm
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-darkgrid')
from statsmodels.graphics.tsaplots import plot_acf
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import HistGradientBoostingRegressor
from lightgbm import LGBMRegressor
from skforecast.ForecasterAutoregMultiSeries import ForecasterAutoregMultiSeries
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.model_selection import backtesting_forecaster
from skforecast.model_selection import grid_search_forecaster
from skforecast.model_selection_multiseries import backtesting_forecaster_multiseries
from skforecast.model_selection_multiseries import grid_search_forecaster_multiseries
# Data download
# ======================================================================================
data = pd.read_csv('./train_stores_kaggle.csv')
display(data)
print(f"Shape: {data.shape}")
# Data preprocessing
# ======================================================================================
selected_store = 2
selected_items = data.item.unique() # All items
# selected_items = [1, 2, 3, 4, 5] # Selection of items to reduce computation time
data = data[(data['store'] == selected_store) & (data['item'].isin(selected_items))].copy()
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')
data = pd.pivot_table(
data = data,
values = 'sales',
index = 'date',
columns = 'item'
)
data.columns.name = None
data.columns = [f"item_{col}" for col in data.columns]
data = data.asfreq('1D')
data = data.sort_index()
data.head(4)
⚠ Warning
Modeling 50 items may require significant computational time. Feel free to select only a subset of items to speed up the execution.
The dataset is divided into 3 partitions: one for training, one for validation, and one for testing.
# Split data into train-validation-test
# ======================================================================================
end_train = '2016-05-31 23:59:00'
end_val = '2017-05-31 23:59:00'
data_train = data.loc[:end_train, :].copy()
data_val = data.loc[end_train:end_val, :].copy()
data_test = data.loc[end_val:, :].copy()
print(f"Train dates : {data_train.index.min()} --- {data_train.index.max()} (n={len(data_train)})")
print(f"Validation dates : {data_val.index.min()} --- {data_val.index.max()} (n={len(data_val)})")
print(f"Test dates : {data_test.index.min()} --- {data_test.index.max()} (n={len(data_test)})")
Four of the series are plotted to understand their trends and patterns. The reader is strongly encouraged to plot several more to gain an in-depth understanding of the series.
# Plot time series
# ======================================================================================
fig, ax = plt.subplots(figsize=(8, 5))
data.iloc[:, :4].plot(
legend = True,
subplots = True,
sharex = True,
title = 'Sales of store 2',
ax = ax,
)
fig.tight_layout();
# Autocorrelation plot
# ======================================================================================
fig, axes = plt.subplots(nrows=4, ncols=1, figsize=(8, 5), sharex=True)
axes = axes.flat
for i, col in enumerate(data.columns[:4]):
plot_acf(data[col], ax=axes[i], lags=7*5)
axes[i].set_ylim(-1, 1.1)
axes[i].set_title(f'{col}')
fig.tight_layout()
plt.show()
The autocorrelation plots show a clear association between sales on one day and sales on the same day a week earlier. This type of correlation is an indication that autoregressive models can work well.
There is also a common weekly seasonality between the series. The more similar the dynamics between the series, the more likely the model will learn useful patterns.
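The weekly pattern can also be checked numerically. As a quick sketch using the data already loaded, the lag-7 autocorrelation of each plotted series is:
# Lag-7 autocorrelation of the first four series
# ======================================================================================
for col in data.columns[:4]:
    print(f"{col}: {data[col].autocorr(lag=7):.2f}")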
A different model is trained for each item of the store and its mean absolute error is estimated using backtesting.
# Train and backtest a model for each item: ForecasterAutoreg
# ======================================================================================
items = []
mae_values = []
predictions = {}
for i, item in enumerate(tqdm(data.columns)):
# Define forecaster
forecaster = ForecasterAutoreg(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_y = StandardScaler()
)
# Backtesting forecaster
metric, preds = backtesting_forecaster(
forecaster = forecaster,
y = data[item],
initial_train_size = len(data_train) + len(data_val),
steps = 7,
metric = 'mean_absolute_error',
refit = False,
fixed_train_size = False,
verbose = False,
show_progress = False
)
items.append(item)
mae_values.append(metric)
predictions[item] = preds
# Results
uni_series_mae = pd.Series(
data = mae_values,
index = items,
name = 'uni_series_mae'
)
A single multi-series model is trained to predict the sales of each product over the next 7 days. As in the predict method, the levels at which backtesting is performed must be indicated through the levels argument. It can also be set to None to perform backtesting on all levels.
# Train and backtest a model for all items: ForecasterAutoregMultiSeries
# ======================================================================================
items = list(data.columns)
# Define forecaster
forecaster_ms = ForecasterAutoregMultiSeries(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_series = StandardScaler(),
)
# Backtesting forecaster for all items
multi_series_mae, predictions_ms = backtesting_forecaster_multiseries(
forecaster = forecaster_ms,
series = data,
levels = items,
steps = 7,
metric = 'mean_absolute_error',
initial_train_size = len(data_train) + len(data_val),
refit = False,
fixed_train_size = False,
verbose = False,
show_progress = True
)
# Results
display(multi_series_mae.head(3))
print('')
display(predictions_ms.head(3))
# Difference of backtesting metric for each item
# ======================================================================================
multi_series_mae = multi_series_mae.set_index('levels')
multi_series_mae.columns = ['multi_series_mae']
results = pd.concat((uni_series_mae, multi_series_mae), axis = 1)
results['improvement'] = results.eval('uni_series_mae - multi_series_mae')
results['improvement_(%)'] = 100 * results.eval('(uni_series_mae - multi_series_mae) / uni_series_mae')
results = results.round(2)
results.style.bar(subset=['improvement_(%)'], align='mid', color=['#d65f5f', '#5fba7d'])
# Average improvement for all items
# ======================================================================================
results[['improvement', 'improvement_(%)']].agg(['mean', 'min', 'max'])
# Number of series with positive and negative improvement
# ======================================================================================
pd.Series(np.where(results['improvement_(%)'] < 0, 'negative', 'positive')).value_counts()
# Plot the item with maximum improvement
# ======================================================================================
fig, ax = plt.subplots(figsize=(8, 4))
data_test['item_3'].tail(60).plot(ax=ax)
predictions['item_3'].tail(60).plot(ax=ax)
predictions_ms['item_3'].tail(60).plot(ax=ax)
ax.legend(['real', 'predictions single forecaster', 'predictions multi series forecaster'])
ax.set_xlabel('')
ax.set_title('Forecasting for Item 3');
# Plot the item with minimum improvement
# ======================================================================================
fig, ax = plt.subplots(figsize=(8, 4))
data_test['item_16'].tail(60).plot(ax=ax)
predictions['item_16'].tail(60).plot(ax=ax)
predictions_ms['item_16'].tail(60).plot(ax=ax)
ax.legend(['real', 'predictions single forecaster', 'predictions multi series forecaster'])
ax.set_xlabel('')
ax.set_title('Forecasting for Item 16');
If a time series has few observations, the amount of information available for the model to learn is limited. This is a well-known problem in real use cases, where there is often not much historical data available. Independent (non-multivariate) multi-series forecasters combine all the series during training, so they have access to more data.
In this section, the same comparison between single-series forecasting and multi-series forecasting is performed but, this time, the length of the series is much shorter.
# Split data into train-validation-test
# ======================================================================================
start_train = '2017-01-01 00:00:00'
end_train = '2017-05-01 00:00:00'
end_val = '2017-07-31 23:59:00'
end_test = '2017-09-30 23:59:00'
data = data.loc[start_train:, :].copy()
data_train = data.loc[:end_train, :].copy()
data_val = data.loc[end_train:end_val, :].copy()
data_test = data.loc[end_val:end_test, :].copy()
print(f"Train dates : {data_train.index.min()} --- {data_train.index.max()} (n={len(data_train)})")
print(f"Validation dates : {data_val.index.min()} --- {data_val.index.max()} (n={len(data_val)})")
print(f"Test dates : {data_test.index.min()} --- {data_test.index.max()} (n={len(data_test)})")
# Train and backtest a model for each item
# ======================================================================================
items = []
mae_values = []
predictions = {}
for i, item in enumerate(tqdm(data.columns)):
forecaster = ForecasterAutoreg(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_y = StandardScaler()
)
metric, preds = backtesting_forecaster(
forecaster = forecaster,
y = data[item],
initial_train_size = len(data_train) + len(data_val),
steps = 7,
metric = 'mean_absolute_error',
refit = False,
fixed_train_size = False,
verbose = False,
show_progress = False
)
items.append(item)
mae_values.append(metric)
predictions[item] = preds
uni_series_mae = pd.Series(
data = mae_values,
index = items,
name = 'uni_series_mae'
)
# Train and backtest a model for all items
# ======================================================================================
# Define forecaster
forecaster_ms = ForecasterAutoregMultiSeries(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_series = StandardScaler(),
)
# Backtesting forecaster for all items
multi_series_mae, predictions_ms = backtesting_forecaster_multiseries(
forecaster = forecaster_ms,
series = data,
levels = None, # If None all levels are selected
steps = 7,
metric = 'mean_absolute_error',
initial_train_size = len(data_train) + len(data_val),
refit = False,
fixed_train_size = False,
verbose = False
)
# Results
display(multi_series_mae.head(3))
print('')
display(predictions_ms.head(3))
# Difference in backtesting metric for each item
# ======================================================================================
multi_series_mae = multi_series_mae.set_index('levels')
multi_series_mae.columns = ['multi_series_mae']
results = pd.concat((uni_series_mae, multi_series_mae), axis = 1)
results['improvement'] = results.eval('uni_series_mae - multi_series_mae')
results['improvement_(%)'] = 100 * results.eval('(uni_series_mae - multi_series_mae) / uni_series_mae')
results = results.round(2)
results
# Average improvement for all items
# ======================================================================================
results[['improvement', 'improvement_(%)']].agg(['mean', 'min', 'max'])
# Number of series with positive and negative improvement
# ======================================================================================
pd.Series(np.where(results['improvement_(%)'] < 0, 'negative', 'positive')).value_counts()
The average improvement has increased from 6.6% to 8.2%. The advantage of using a multi-series forecaster seems to grow as the length of the available series decreases.
In the previous sections, the comparison between forecasters was done without optimizing the hyperparameters of the regressors. To make a fair comparison, a grid search strategy is used in order to select the best configuration for each forecaster. See more information in hyperparameter tuning and lags selection.
⚠ Warning
The following section may require significant computational time (about 45 minutes). Feel free to select only a subset of items to speed up the execution.
# Hide progress bar tqdm
# ======================================================================================
from tqdm import tqdm
from functools import partialmethod
tqdm.__init__ = partialmethod(tqdm.__init__, disable=True)
# Hyperparameter search and backtesting of each item's model
# ======================================================================================
items = []
mae_values = []
lags_grid = [7, 14, 21]
param_grid = {
'max_iter': [100, 500],
'max_depth': [3, 5, 10, None],
'learning_rate': [0.01, 0.1]
}
for i, item in enumerate(data.columns):
forecaster = ForecasterAutoreg(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_y = StandardScaler()
)
results_grid = grid_search_forecaster(
forecaster = forecaster,
y = data.loc[:end_val, item],
lags_grid = lags_grid,
param_grid = param_grid,
steps = 7,
metric = 'mean_absolute_error',
initial_train_size = len(data_train),
refit = False,
fixed_train_size = False,
return_best = True,
verbose = False,
show_progress = False
)
metric, preds = backtesting_forecaster(
forecaster = forecaster,
y = data[item],
initial_train_size = len(data_train) + len(data_val),
steps = 7,
metric = 'mean_absolute_error',
refit = False,
fixed_train_size = False,
verbose = False,
show_progress = False
)
items.append(item)
mae_values.append(metric)
uni_series_mae = pd.Series(
data = mae_values,
index = items,
name = 'uni_series_mae'
)
# Hyperparameter search for the multi-series model and backtesting for each item
# ======================================================================================
lags_grid = [7, 14, 21]
param_grid = {
'max_iter': [100, 500],
'max_depth': [3, 5, 10, None],
'learning_rate': [0.01, 0.1]
}
forecaster_ms = ForecasterAutoregMultiSeries(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_series = StandardScaler(),
)
results_grid_ms = grid_search_forecaster_multiseries(
forecaster = forecaster_ms,
series = data.loc[:end_val, :],
levels = None, # If None all levels are selected
lags_grid = lags_grid,
param_grid = param_grid,
steps = 7,
metric = 'mean_absolute_error',
initial_train_size = len(data_train),
refit = False,
fixed_train_size = False,
return_best = True,
verbose = False
)
multi_series_mae, predictions_ms = backtesting_forecaster_multiseries(
forecaster = forecaster_ms,
series = data,
levels = None, # If None all levels are selected
steps = 7,
metric = 'mean_absolute_error',
initial_train_size = len(data_train) + len(data_val),
refit = False,
fixed_train_size = False,
verbose = False
)
# Difference in backtesting metric for each item
# ======================================================================================
multi_series_mae = multi_series_mae.set_index('levels')
multi_series_mae.columns = ['multi_series_mae']
results = pd.concat((uni_series_mae, multi_series_mae), axis = 1)
results['improvement'] = results.eval('uni_series_mae - multi_series_mae')
results['improvement_(%)'] = 100 * results.eval('(uni_series_mae - multi_series_mae) / uni_series_mae')
results = results.round(2)
results.style.bar(subset=['improvement_(%)'], align='mid', color=['#d65f5f', '#5fba7d'])
# Average improvement for all items
# ======================================================================================
results[['improvement', 'improvement_(%)']].agg(['mean', 'min', 'max'])
# Number of series with positive and negative improvement
# ======================================================================================
pd.Series(np.where(results['improvement_(%)'] < 0, 'negative', 'positive')).value_counts()
After identifying the combination of lags and hyperparameters that achieves the best predictive performance for each forecaster, more of the single-series models reach higher predictive ability, since each one is tuned to its own data (a single item). Even so, the multi-series model still provides better results for most of the items.
Weights are used to control the influence that each observation has on the training of the model. ForecasterAutoregMultiSeries accepts two types of weights:
series_weights controls the relative importance of each series. If a series has twice the weight of the others, its observations influence the training twice as much. The higher the weight of a series relative to the others, the more the model will focus on trying to learn it.
weight_func controls the relative importance of each observation according to its index value. For example, a function that assigns a lower weight to certain dates.
If both types of weights are indicated, they are multiplied to create the final weights. The resulting sample_weight cannot have negative values.
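The combination can be sketched with plain NumPy (a toy illustration of the multiplication described above, not skforecast internals):
# Sketch: how `series_weights` and `weight_func` combine (illustrative values)
# ======================================================================================
import numpy as np
import pandas as pd
index = pd.date_range('2013-11-28', periods=5, freq='D')
series_weight = 3.0                                  # from `series_weights`
obs_weights = np.where(index >= '2013-12-01', 0, 1)  # from `weight_func`
sample_weight = series_weight * obs_weights          # final weights, never negative
print(sample_weight)  # [3. 3. 3. 0. 0.]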
Learn more about weights in multi-series forecasting and weighted time series forecasting with skforecast.
In this example, item_1
has higher relative importance among series (it weighs 3 times more than the rest of the series), and observations between '2013-12-01' and '2014-01-31' are considered non-representative and a weight of 0 is applied to them.
# Weights in Multi-Series
# ======================================================================================
# `series_weights`
series_weights = {'item_1': 3.} # Series not present in the dict will have weight 1
# A dictionary can be passed to `weight_func` to apply a different function to each series
# If a series is not present in the dict, it will have weight 1
def custom_weights(index):
"""
Return 0 if index is between '2013-12-01' and '2014-01-31', 1 otherwise.
"""
weights = np.where(
(index >= '2013-12-01') & (index <= '2014-01-31'),
0,
1
)
return weights
forecaster = ForecasterAutoregMultiSeries(
regressor = HistGradientBoostingRegressor(random_state=123),
lags = 14,
transformer_series = StandardScaler(),
transformer_exog = None,
weight_func = custom_weights,
series_weights = series_weights
)
forecaster.fit(series=data)
forecaster.predict(steps=7).head(3)
This use case shows that a multi-series model may have advantages over multiple individual models when forecasting time series that follow similar dynamics. Beyond the potential improvements in forecasting, it is also important to take into consideration the benefit of having only one model to maintain.
import session_info
session_info.show(html=False)
How to cite this document?
Multi-series forecasting with python and skforecast by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under a CC BY-NC-SA 4.0 license at https://www.cienciadedatos.net/documentos/py44-multi-series-forecasting-skforecast.html
This work by Joaquín Amat Rodrigo and Javier Escobar Ortiz is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.