• Introduction
  • Libraries
  • Data
  • XGBoost
  • LightGBM
  • CatBoost
  • RAPIDS cuML
  • Session information
  • Citation


Introduction

Traditionally, machine learning algorithms have been executed on CPUs (Central Processing Units)—general-purpose processors capable of handling a wide variety of tasks. However, CPUs are not optimized for the highly parallelized matrix operations that many machine learning algorithms rely on, often resulting in slower training times and limited scalability. In contrast, GPUs (Graphics Processing Units) are specifically designed for parallel processing, capable of performing thousands of simultaneous mathematical operations. This makes them particularly well-suited for training and deploying large-scale machine learning models.

Many popular machine learning libraries have implemented GPU acceleration, including XGBoost, LightGBM, CatBoost, and cuML. By leveraging GPU capabilities, these libraries can reduce training times and improve scalability. The following sections demonstrate how to run skforecast with GPU acceleration and compare its performance with CPU execution.

✎ Note

The performance advantage of using a GPU depends heavily on the specific task and the size of the dataset. Generally, GPU acceleration offers the greatest benefits when working with large datasets and complex models, where its parallel processing capabilities can significantly reduce training times.

  • In recursive forecasting (ForecasterRecursive and ForecasterRecursiveMultiseries), the prediction phase must be executed sequentially because each time step depends on the previous prediction. This inherent dependency prevents parallelization during inference, which explains why model fitting is substantially faster on a GPU while prediction can actually be slower than on a CPU. To overcome this limitation, skforecast automatically switches the regressor to the CPU for prediction, even if it was trained on a GPU (a sketch of this idea is shown after this list).

  • Direct forecasters (ForecasterDirect, ForecasterDirectMultivariate) do not rely on previous predictions during inference. This lack of dependency allows both training and prediction to fully benefit from GPU acceleration.
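
The device switch can also be reproduced manually. Below is a minimal sketch of the idea using an XGBoost >= 2.0 regressor on synthetic data (for illustration only; skforecast performs this switch automatically):

# Manual device switch (illustrative sketch)
# ==============================================================================
import numpy as np
from xgboost import XGBRegressor

X = np.random.normal(size=(1_000, 10))
y = np.random.normal(size=1_000)

model = XGBRegressor(n_estimators=100, device='cuda')
model.fit(X, y)                 # training exploits the GPU's parallelism
model.set_params(device='cpu')  # sequential one-step-ahead predictions run faster on CPU
model.predict(X[:1])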

✎ Note

Despite the significant advantages offered by GPUs (specifically NVIDIA GPUs) in accelerating machine learning computations, access to them is often limited by high costs or other practical constraints. Fortunately, Google Colaboratory (Colab), a free Jupyter notebook environment, allows users to run Python code in the cloud with access to GPUs. The following links provide access to Google Colab notebooks that demonstrate how to use skforecast with GPU acceleration.
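
In Colab, a GPU runtime must be selected (Runtime > Change runtime type) before a GPU becomes available. The assigned GPU can then be inspected from a notebook cell:

!nvidia-smi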

Libraries

The libraries used in this document are:

# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import torch
import psutil
import xgboost
from xgboost import XGBRegressor
import lightgbm
from lightgbm import LGBMRegressor
import catboost
from catboost import CatBoostRegressor
import cuml
from sklearn.ensemble import RandomForestRegressor
import warnings
import skforecast
from skforecast.recursive import ForecasterRecursive
from skforecast.model_selection import backtesting_forecaster, TimeSeriesFold

print(f"skforecast version : {skforecast.__version__}")
print(f"xgboost version    : {xgboost.__version__}")
print(f"lightgbm version   : {lightgbm.__version__}")
print(f"catboost version   : {catboost.__version__}")
print(f"cuml version       : {cuml.__version__}")
skforecast version : 0.16.0
xgboost version    : 2.1.2
lightgbm version   : 4.5.0
catboost version   : 1.2.8
# Print information about the GPU and CPU
# ==============================================================================
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated :', round(torch.cuda.memory_allocated(0) / 1024**3, 1), 'GB')
    print('Reserved  :', round(torch.cuda.memory_reserved(0) / 1024**3, 1), 'GB')

print(f"CPU RAM Free: {psutil.virtual_memory().available / 1024**3:.2f} GB")
Using device: cuda
NVIDIA T1200 Laptop GPU
Memory Usage:
Allocated : 0.0 GB
Reserved  : 0.0 GB
CPU RAM Free: 13.62 GB

Data

A time series with one million data points is simulated.

# Data
# ==============================================================================
n = 1_000_000
data = pd.Series(
    data  = np.random.normal(size=n), 
    index = pd.date_range(start="1990-01-01", periods=n, freq="h"),
    name  = 'y'
)
data.head(2)
1990-01-01 00:00:00    1.500408
1990-01-01 01:00:00   -1.112868
Freq: h, Name: y, dtype: float64
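
The series is simulated without a fixed seed, so the exact values differ between runs. If reproducibility matters, a seeded variant could look like this (a sketch; the seed value is arbitrary):

# Reproducible simulation (sketch, arbitrary seed)
# ==============================================================================
rng = np.random.default_rng(seed=123)
data = pd.Series(
    data  = rng.normal(size=n),
    index = pd.date_range(start="1990-01-01", periods=n, freq="h"),
    name  = 'y'
)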

XGBoost

To run an XGBoost model (version 2.0 or higher) on a GPU, set the argument device='cuda' during initialization.

The following sections compare the time taken to fit, predict, and backtest a model using XGBoost on a CPU versus a GPU.

# Suppress warnings
# ==============================================================================
warnings.filterwarnings(
    "ignore",
    message=".*Falling back to prediction using DMatrix.*",
    category=UserWarning,
    module="xgboost"
)
# Create and train forecaster with an XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = XGBRegressor(
                                 n_estimators = 1000,
                                 device       = 'cuda',
                                 verbosity    = 1
                             ),
                 lags = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using GPU: {elapsed_time}")

# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
         steps              = 100,
         initial_train_size = 990_000,
         refit              = False,
         verbose            = False
     )
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using GPU: {elapsed_time}")
Training time using GPU: 0 days 00:00:28.760755
Prediction time using GPU: 0 days 00:00:00.095367
Backtesting time using GPU: 0 days 00:00:43.120299
# Create and train forecaster with an XGBRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = XGBRegressor(n_estimators=1000),
                 lags      = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using CPU: {elapsed_time}")

# Backtesting
# ==============================================================================
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using CPU: {elapsed_time}")
Training time using CPU: 0 days 00:00:44.323256
Prediction time using CPU: 0 days 00:00:00.143112
Backtesting time using CPU: 0 days 00:00:48.289869

LightGBM

To run a LightGBM model on a GPU, set the argument device='gpu' during initialization. Note that this requires a GPU-enabled build of LightGBM.

# Suppress warnings
# ==============================================================================
warnings.filterwarnings(
    "ignore",
    message="'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.",
    category=FutureWarning,
    module="sklearn.utils.deprecation"
)

When using Google Colab, run the following command in a notebook cell to ensure LightGBM can utilize the NVIDIA GPU.

!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
# Create and train forecaster with an LGBMRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = LGBMRegressor(n_estimators=1000, device='gpu', verbose=-1),
                 lags      = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using GPU: {elapsed_time}")

# Backtesting
# ==============================================================================
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using GPU: {elapsed_time}")
Training time using GPU: 0 days 00:00:21.657198
Prediction time using GPU: 0 days 00:00:00.059898
Backtesting time using GPU: 0 days 00:00:24.554291
# Create and train forecaster with an LGBMRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = LGBMRegressor(n_estimators=1000, device='cpu', verbose=-1),
                 lags      = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using CPU: {elapsed_time}")

# Backtesting
# ==============================================================================
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using CPU: {elapsed_time}")
Training time using CPU: 0 days 00:00:27.196106
Prediction time using CPU: 0 days 00:00:00.044868
Backtesting time using CPU: 0 days 00:00:25.313712

CatBoost

To run a CatBoost model on a GPU, set the argument task_type='GPU' during initialization.

# Create and train forecaster with a CatBoostRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = CatBoostRegressor(n_estimators=1000, task_type='GPU', silent=True, allow_writing_files=False),
                 lags      = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using GPU: {elapsed_time}")

# Backtesting
# ==============================================================================
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using GPU: {elapsed_time}")
Training time using GPU: 0 days 00:00:26.810713
Prediction time using GPU: 0 days 00:00:00.109821
Backtesting time using GPU: 0 days 00:00:30.199580
# Create and train forecaster with a CatBoostRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = CatBoostRegressor(n_estimators=1000, task_type='CPU', silent=True, allow_writing_files=False),
                 lags      = 50
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using CPU: {elapsed_time}")

# Backtesting
# ==============================================================================
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using CPU: {elapsed_time}")
Training time using CPU: 0 days 00:01:13.747194
Prediction time using CPU: 0 days 00:00:00.101679
Backtesting time using CPU: 0 days 00:01:06.052743

RAPIDS cuML

cuML is a library for running machine learning algorithms on GPUs with an API that closely mirrors scikit-learn's. To use it with skforecast, you need to install the RAPIDS cuML library. The installation process varies depending on your environment and CUDA version; detailed instructions are available in the RAPIDS documentation.
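
As an example, at the time of writing cuML can be installed with pip in CUDA 12 environments (verify the exact command for your setup in the RAPIDS installation guide):

pip install --extra-index-url=https://pypi.nvidia.com cuml-cu12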

# Create and train forecaster with a RandomForestRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor = cuml.ensemble.RandomForestRegressor(
                                n_estimators=200,
                                max_depth=5,
                            ),
                lags = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using GPU: {elapsed_time}")

# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
         steps              = 100,
         initial_train_size = 90_000,
         refit              = False,
         verbose            = False
     )
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using GPU: {elapsed_time}")
# Create and train forecaster with a RandomForestRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor = RandomForestRegressor(n_estimators=200, max_depth=5),
                lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")

# Predict
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Prediction time using CPU: {elapsed_time}")

# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
         steps              = 100,
         initial_train_size = 90_000,
         refit              = False,
         verbose            = False
     )
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'
    )
elapsed_time = pd.Timestamp.now() - start_time

print(f"Backtesting time using CPU: {elapsed_time}")

Session information

import session_info
session_info.show(html=False)

Citation

How to cite this document

If you use this document or any part of it, please acknowledge the source, thank you!

Accelerate forecasting models with GPUs by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) at https://cienciadedatos.net/documentos/py65-accelerate-forecasting-models-gpu.html

How to cite skforecast

If you use skforecast for a publication, we would appreciate it if you cite the published software.

Zenodo:

Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.16.0). Zenodo. https://doi.org/10.5281/zenodo.8382788

APA:

Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.16.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788

BibTeX:

@software{skforecast,
  author  = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier},
  title   = {skforecast},
  version = {0.16.0},
  month   = {05},
  year    = {2025},
  license = {BSD-3-Clause},
  url     = {https://skforecast.org/},
  doi     = {10.5281/zenodo.8382788}
}


Did you like the article? Your support is important

Your contribution will help me to continue generating free educational content. Many thanks! 😊

Become a GitHub Sponsor

Creative Commons Licence

This work by Joaquín Amat Rodrigo and Javier Escobar Ortiz is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.

Allowed:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NonCommercial: You may not use the material for commercial purposes.

  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.