More about forecasting in cienciadedatos.net


Introduction

A time series is a succession of chronologically ordered data spaced at equal or unequal intervals. The forecasting process consists of predicting the future value of a time series, either by modeling the series solely based on its past behavior (autoregressive) or by using other external variables.

This guide explores the use of scikit-learn regression models for time series forecasting. Specifically, it introduces skforecast, a library that provides the classes and functions needed to adapt any scikit-learn regression model to forecasting problems.

✎ Note

This document serves as an introductory guide to machine learning based forecasting using skforecast. For more advanced and detailed examples, please explore the other forecasting resources on cienciadedatos.net, which delve deeper into diverse applications and offer practical demonstrations of advanced techniques in time series forecasting with machine learning.

Machine learning for forecasting

In order to apply machine learning models to forecasting problems, the time series has to be transformed into a matrix in which each value is related to the time window (lags) that precedes it.

In a time series context, a lag with respect to a time step $t$ is the value of the series at a previous time step. For example, lag 1 is the value at time step $t - 1$ and lag $m$ is the value at time step $t - m$.


Time series transformation into a matrix of 5 lags and a vector with the value of the series that follows each row of the matrix.
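The transformation described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper name `make_lag_matrix` and the toy series are hypothetical and are not part of skforecast, which builds these matrices internally.

```python
import numpy as np


def make_lag_matrix(series, n_lags):
    """Return (X, y): X[:, k-1] holds lag k of each target value in y."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Column k-1 contains the value observed k steps before each target.
    X = np.column_stack(
        [series[n_lags - k : n - k] for k in range(1, n_lags + 1)]
    )
    # The first n_lags values have no complete window, so they are not targets.
    y = series[n_lags:]
    return X, y


series = np.arange(10)  # toy series: 0, 1, ..., 9
X, y = make_lag_matrix(series, n_lags=5)
```

Each row of `X` is one training instance, ordered from lag 1 to lag 5, and the matching entry of `y` is the value that follows that window.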

This type of transformation also allows additional (exogenous) variables to be included as predictors.

Time series transformation including an exogenous variable.
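As a rough sketch of how an exogenous variable can enter the matrix, the example below appends one extra column to the lag columns. It assumes that the exogenous value is known for the time step being predicted and is aligned with the target. The variable names and toy data are hypothetical.

```python
import numpy as np

n_lags = 3
y = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])  # toy target series
exog = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])     # toy exogenous series

# Lag columns: column k-1 holds the value observed k steps before the target.
lags = np.column_stack(
    [y[n_lags - k : len(y) - k] for k in range(1, n_lags + 1)]
)

# Assumption: the exogenous value of the *target* time step is used as a predictor.
X = np.column_stack([lags, exog[n_lags:]])
target = y[n_lags:]
```

The first row of `X` is `[12, 11, 10, 1]`: lags 1 to 3 of the target value 13, followed by the exogenous value observed at that same time step.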

Once data have been rearranged into the new shape, any regression model can be trained to predict the next value (step) of the series. During model training, every row is considered a separate data instance, where values at lags $1, 2, \ldots, p$ are considered predictors for the target quantity of the time series at time step $t + 1$.

Multi-Step Time Series Forecasting

When working with time series, it is rarely enough to predict only the next element in the series ($t_{+1}$). Instead, the most common goal is to predict a whole future interval ($t_{+1}, \ldots, t_{+n}$) or a distant point in time ($t_{+n}$). Several strategies can generate this type of prediction.

Recursive multi-step forecasting

Since the value $t_{n-1}$ is required to predict $t_{n}$, and $t_{n-1}$ is unknown, a recursive process is applied in which each new prediction is based on the previous one. This process is known as recursive forecasting or recursive multi-step forecasting, and it can be easily generated with the ForecasterRecursive class.

Recursive multi-step prediction process diagram to predict 3 steps into the future using the last 4 lags of the series as predictors.
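The recursive process can be illustrated with a minimal sketch that uses plain scikit-learn rather than the ForecasterRecursive class. The helper `recursive_forecast`, the toy series, and the choice of `LinearRegression` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def recursive_forecast(model, last_window, steps):
    """Predict `steps` values, feeding each prediction back in as a new lag.

    `last_window` holds the most recent observations in chronological order.
    """
    window = list(last_window)
    predictions = []
    for _ in range(steps):
        # The model was trained with lag 1 in the first column, so reverse.
        features = np.array(window[::-1]).reshape(1, -1)
        next_value = float(model.predict(features)[0])
        predictions.append(next_value)
        # Drop the oldest value and append the new prediction.
        window = window[1:] + [next_value]
    return np.array(predictions)


n_lags = 4
series = np.arange(20, dtype=float)  # toy series with a perfect linear trend
X = np.column_stack(
    [series[n_lags - k : len(series) - k] for k in range(1, n_lags + 1)]
)
y = series[n_lags:]

model = LinearRegression().fit(X, y)
predictions = recursive_forecast(model, series[-n_lags:], steps=3)
```

On this perfectly linear toy series the three forecasts continue the trend (approximately 20, 21 and 22). With real data, errors in early predictions propagate to later steps because they are reused as predictors.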


Direct multi-step forecasting

Direct multi-step forecasting consists of training a different model for each step of the forecast horizon. For example, to predict the next 5 values of a time series, 5 different models are trained, one for each step. As a result, the predictions are independent of each other.

Direct multi-step prediction process diagram to predict 3 steps into the future using the last 4 lags of the series as predictors.


The main complexity of this approach is to generate the correct training matrices for each model. The ForecasterDirect class of the skforecast library automates this process. It is also important to bear in mind that this strategy has a higher computational cost, since it requires training multiple models. The following diagram shows the process for a case in which the response variable and two exogenous variables are available.

Transformation of a time series into matrices to train a direct multi-step forecasting model
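A minimal sketch of the direct strategy, written with plain scikit-learn instead of the ForecasterDirect class, could look as follows. It uses lags only (no exogenous variables), and the toy series, the variable names, and the choice of `LinearRegression` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

n_lags, steps = 4, 3
series = np.arange(30, dtype=float)  # toy series with a perfect linear trend

models = {}
for h in range(1, steps + 1):
    # Each horizon h needs its own target: the value h steps after the window.
    n_rows = len(series) - n_lags - h + 1
    X_h = np.column_stack(
        [series[n_lags - k : n_lags - k + n_rows] for k in range(1, n_lags + 1)]
    )
    y_h = series[n_lags + h - 1 : n_lags + h - 1 + n_rows]
    models[h] = LinearRegression().fit(X_h, y_h)

# Every model receives the same last observed window (lag 1 first).
last_window = series[-n_lags:][::-1].reshape(1, -1)
predictions = np.array(
    [models[h].predict(last_window)[0] for h in range(1, steps + 1)]
)
```

Unlike the recursive strategy, no prediction is fed back as a predictor: each of the three models maps the same observed window to its own step of the horizon.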


Multi-output forecasting

Some machine learning models, such as long short-term memory (LSTM) neural networks, can predict multiple values of a sequence simultaneously (one-shot). This strategy is implemented in the ForecasterRnn class of the skforecast library.
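The one-shot idea can be sketched without a neural network. The example below is not an LSTM and does not use ForecasterRnn. As a stand-in, it fits a single scikit-learn `LinearRegression` on a two-dimensional target, so that one model returns all steps of the horizon in a single call. The toy series and variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

n_lags, steps = 4, 3
series = np.arange(30, dtype=float)  # toy series with a perfect linear trend

# Only windows followed by `steps` observed values can be used for training.
n_rows = len(series) - n_lags - steps + 1
X = np.column_stack(
    [series[n_lags - k : n_lags - k + n_rows] for k in range(1, n_lags + 1)]
)
# Target matrix: column h-1 holds the value h steps after each window.
Y = np.column_stack(
    [series[n_lags + h - 1 : n_lags + h - 1 + n_rows] for h in range(1, steps + 1)]
)

model = LinearRegression().fit(X, Y)  # a single model with a multi-output target

last_window = series[-n_lags:][::-1].reshape(1, -1)
predictions = model.predict(last_window)[0]  # all steps returned at once
```

The difference from the direct strategy is that a single fitted model produces the whole horizon, rather than one model per step.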