
Introduction

In early 2025, Kaggle launched the Forecasting Sticker Sales competition as part of its Playground series (Season 5, Episode 1), providing a hands-on challenge for time series forecasting enthusiasts.

Contestants were tasked with forecasting daily sales for five Kaggle-branded products across six countries and three store types, resulting in 90 different time series. The forecasting period spanned from 2017 to 2019, with historical sales data from 2010 to 2016 available for training. Performance was evaluated using Mean Absolute Percentage Error (MAPE), highlighting the importance of accurate predictions across a diverse and moderately granular dataset.
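For reference, MAPE averages the absolute errors relative to the true values. The following is a minimal sketch in plain Python; the function name `mape` and the sample numbers are illustrative and are not taken from the competition.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.05 == 5 %)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Each error is divided by the magnitude of the true value, so a miss of
    # 10 units weighs more on a series selling 100 than on one selling 1000.
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
print(f"MAPE: {score:.4f}")
```

Because the errors are relative, low-volume series can weigh heavily on the final score, and the metric is undefined when a true value is exactly zero.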

This document serves as an introductory example of how to use the skforecast Python library to build predictive models and generate predictions. It will guide the reader through the main steps of a typical forecasting project, including data preparation, model training, and evaluation. While the focus is on simplicity and clarity, achieving peak performance would require more advanced feature engineering and iterative model refinement.

Libraries

The libraries used in this notebook are:

# Data management
# ==============================================================================
import pandas as pd
import numpy as np
from itertools import product

# Plots
# ==============================================================================
from matplotlib import pyplot as plt
from skforecast.plot import set_dark_theme

# Modelling
# ==============================================================================
from skforecast.recursive import ForecasterRecursiveMultiSeries
from skforecast.model_selection import bayesian_search_forecaster_multiseries, OneStepAheadFold
from skforecast.preprocessing import series_long_to_dict, exog_long_to_dict
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OrdinalEncoder
from lightgbm import LGBMRegressor
import holidays
from feature_engine.datetime import DatetimeFeatures
from feature_engine.creation import CyclicalFeatures

# Warnings
# ==============================================================================
import warnings
from skforecast.exceptions import MissingValuesWarning
warnings.simplefilter('ignore', category=MissingValuesWarning)

Data

The datasets used in this notebook come from the Kaggle competition "Forecasting Sticker Sales" (Walter Reade and Elizabeth Park, 2025).

# Data
# ==============================================================================
data_train = pd.read_csv('train.csv')
data_test = pd.read_csv('test.csv')
display(data_train.head())
display(data_test.head())

	id	date	country	store	product	num_sold
0	0	2010-01-01	Canada	Discount Stickers	Holographic Goose	NaN
1	1	2010-01-01	Canada	Discount Stickers	Kaggle	973.0
2	2	2010-01-01	Canada	Discount Stickers	Kaggle Tiers	906.0
3	3	2010-01-01	Canada	Discount Stickers	Kerneler	423.0
4	4	2010-01-01	Canada	Discount Stickers	Kerneler Dark Mode	491.0

	id	date	country	store	product
0	230130	2017-01-01	Canada	Discount Stickers	Holographic Goose
1	230131	2017-01-01	Canada	Discount Stickers	Kaggle
2	230132	2017-01-01	Canada	Discount Stickers	Kaggle Tiers
3	230133	2017-01-01	Canada	Discount Stickers	Kerneler
4	230134	2017-01-01	Canada	Discount Stickers	Kerneler Dark Mode

# Convert 'date' column to datetime
# ==============================================================================
data_train['date'] = pd.to_datetime(data_train['date'])
data_test['date'] = pd.to_datetime(data_test['date'])

# Create a new column 'unique_id' that identifies each time series as the combination
# of columns 'country', 'store', and 'product'. Spaces inside the names are kept,
# so the identifiers match the ones used in the rest of the notebook.
# ==============================================================================
data_train['unique_id'] = (
    data_train['country'] + '_' +
    data_train['store'] + '_' +
    data_train['product']
)
data_test['unique_id'] = (
    data_test['country'] + '_' +
    data_test['store'] + '_' +
    data_test['product']
)

display(data_train.head())
display(data_test.head())

	id	date	country	store	product	num_sold	unique_id
0	0	2010-01-01	Canada	Discount Stickers	Holographic Goose	NaN	Canada_Discount Stickers_Holographic Goose
1	1	2010-01-01	Canada	Discount Stickers	Kaggle	973.0	Canada_Discount Stickers_Kaggle
2	2	2010-01-01	Canada	Discount Stickers	Kaggle Tiers	906.0	Canada_Discount Stickers_Kaggle Tiers
3	3	2010-01-01	Canada	Discount Stickers	Kerneler	423.0	Canada_Discount Stickers_Kerneler
4	4	2010-01-01	Canada	Discount Stickers	Kerneler Dark Mode	491.0	Canada_Discount Stickers_Kerneler Dark Mode

	id	date	country	store	product	unique_id
0	230130	2017-01-01	Canada	Discount Stickers	Holographic Goose	Canada_Discount Stickers_Holographic Goose
1	230131	2017-01-01	Canada	Discount Stickers	Kaggle	Canada_Discount Stickers_Kaggle
2	230132	2017-01-01	Canada	Discount Stickers	Kaggle Tiers	Canada_Discount Stickers_Kaggle Tiers
3	230133	2017-01-01	Canada	Discount Stickers	Kerneler	Canada_Discount Stickers_Kerneler
4	230134	2017-01-01	Canada	Discount Stickers	Kerneler Dark Mode	Canada_Discount Stickers_Kerneler Dark Mode
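The identifiers in the output above keep the spaces of the original columns. Should underscore-only identifiers be preferred, it helps to know that `Series.replace` matches whole cell values, whereas `Series.str.replace` substitutes substrings. A small sketch on made-up values:

```python
import pandas as pd

ids = pd.Series(["Canada_Discount Stickers_Kaggle", " "])

# Series.replace compares whole cell values: only the cell that is exactly ' '
# is replaced, so spaces inside longer strings are left untouched.
whole_value = ids.replace(" ", "_")

# Series.str.replace works on substrings within each cell.
substring = ids.str.replace(" ", "_", regex=False)

print(whole_value.tolist())
print(substring.tolist())
```

The rest of this notebook refers to series by their identifiers with spaces (for example in the plotting cell), so switching to underscores would require updating those names as well.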

Range of available dates

# Unique countries, stores and products
# ==============================================================================
print('Number of unique time series:', data_train['unique_id'].nunique())
print('Unique countries :', data_train['country'].unique())
print('Unique stores    :', data_train['store'].unique())
print('Unique products  :', data_train['product'].unique())

Unique countries: ['Canada' 'Finland' 'Italy' 'Kenya' 'Norway' 'Singapore']
Unique stores: ['Discount Stickers' 'Stickers for Less' 'Premium Sticker Mart']
Unique products: ['Holographic Goose' 'Kaggle' 'Kaggle Tiers' 'Kerneler'
 'Kerneler Dark Mode']
Number of unique time series: 90

# Date range in the training and test sets
# ==============================================================================
print('Date range in the training set :', data_train['date'].min(), 'to', data_train['date'].max())
print('Date range in the test set     :', data_test['date'].min(), 'to', data_test['date'].max())

Date range in the training set: 2010-01-01 00:00:00 to 2016-12-31 00:00:00
Date range in the test set    : 2017-01-01 00:00:00 to 2019-12-31 00:00:00

# Date range available in the training set for each time series
# ==============================================================================
date_range = data_train.groupby('unique_id')['date'].agg(['min', 'max', 'count'])
date_range = date_range.rename(columns={'min': 'start_date', 'max': 'end_date'})
date_range

	start_date	end_date	count
unique_id
Canada_Discount Stickers_Holographic Goose	2010-01-01	2016-12-31	2557
Canada_Discount Stickers_Kaggle	2010-01-01	2016-12-31	2557
Canada_Discount Stickers_Kaggle Tiers	2010-01-01	2016-12-31	2557
Canada_Discount Stickers_Kerneler	2010-01-01	2016-12-31	2557
Canada_Discount Stickers_Kerneler Dark Mode	2010-01-01	2016-12-31	2557
...	...	...	...
Singapore_Stickers for Less_Holographic Goose	2010-01-01	2016-12-31	2557
Singapore_Stickers for Less_Kaggle	2010-01-01	2016-12-31	2557
Singapore_Stickers for Less_Kaggle Tiers	2010-01-01	2016-12-31	2557
Singapore_Stickers for Less_Kerneler	2010-01-01	2016-12-31	2557
Singapore_Stickers for Less_Kerneler Dark Mode	2010-01-01	2016-12-31	2557

90 rows × 3 columns

# Plot 3 time series
# ==============================================================================
set_dark_theme()
series_to_plot = [
    'Italy_Stickers for Less_Kerneler Dark Mode',
    'Singapore_Stickers for Less_Holographic Goose',
    'Italy_Stickers for Less_Kaggle'
]

for series in series_to_plot:
    fig, ax = plt.subplots(1, 1, figsize=(7, 2.5))
    data_train.query('unique_id == @series').plot(
        x='date',
        y=['num_sold'],
        ax=ax,
        title=series,
        linewidth=0.3,
        legend=False,
    )
    plt.show()

Missing values in target variable

Only 9 series have missing values in the target variable num_sold, most of them for the "Holographic Goose" product. Since the challenge description gives no information about the missing values, a decision must be made about how to interpret them:

A missing value means that the product was not sold on that day, so the sales are 0.
A missing value means that the sales of the product are unknown, so they could be 0 or any other value.

# Ensure all time series are complete without intermediate gaps
# ==============================================================================
data_train = (
    data_train
    .groupby('unique_id')
    .apply(lambda group: group.set_index('date').asfreq('D', fill_value=np.nan), include_groups=False)
    .reset_index()
)
data_train

	unique_id	date	id	country	store	product	num_sold
0	Canada_Discount Stickers_Holographic Goose	2010-01-01	0	Canada	Discount Stickers	Holographic Goose	NaN
1	Canada_Discount Stickers_Holographic Goose	2010-01-02	90	Canada	Discount Stickers	Holographic Goose	NaN
2	Canada_Discount Stickers_Holographic Goose	2010-01-03	180	Canada	Discount Stickers	Holographic Goose	NaN
3	Canada_Discount Stickers_Holographic Goose	2010-01-04	270	Canada	Discount Stickers	Holographic Goose	NaN
4	Canada_Discount Stickers_Holographic Goose	2010-01-05	360	Canada	Discount Stickers	Holographic Goose	NaN
...	...	...	...	...	...	...	...
230125	Singapore_Stickers for Less_Kerneler Dark Mode	2016-12-27	229764	Singapore	Stickers for Less	Kerneler Dark Mode	1016.0
230126	Singapore_Stickers for Less_Kerneler Dark Mode	2016-12-28	229854	Singapore	Stickers for Less	Kerneler Dark Mode	1062.0
230127	Singapore_Stickers for Less_Kerneler Dark Mode	2016-12-29	229944	Singapore	Stickers for Less	Kerneler Dark Mode	1178.0
230128	Singapore_Stickers for Less_Kerneler Dark Mode	2016-12-30	230034	Singapore	Stickers for Less	Kerneler Dark Mode	1357.0
230129	Singapore_Stickers for Less_Kerneler Dark Mode	2016-12-31	230124	Singapore	Stickers for Less	Kerneler Dark Mode	1312.0

230130 rows × 7 columns
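The row count is still 230130 (90 series × 2557 days), which suggests the original data had no gaps and the step acts as a safeguard. A minimal sketch of what `asfreq('D')` does when a date is absent, using a made-up toy series:

```python
import pandas as pd

# Toy daily series with 2010-01-03 missing (values are illustrative)
toy = pd.DataFrame(
    {"num_sold": [10.0, 12.0, 15.0]},
    index=pd.to_datetime(["2010-01-01", "2010-01-02", "2010-01-04"]),
)

# asfreq('D') reindexes to a complete daily calendar; dates that were absent
# appear as new rows filled with NaN.
complete = toy.asfreq("D")
print(complete)
```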

# Percentage of missing values in each series
# ==============================================================================
missing_pct = (
    data_train
    .groupby('unique_id')
    .apply(lambda group: group['num_sold'].isna().mean() * 100, include_groups=False)
    .sort_values(ascending=False)
    .reset_index(name='missing_values_pct')
)
missing_pct.query('missing_values_pct > 0')

	unique_id	missing_values_pct
0	Canada_Discount Stickers_Holographic Goose	100.000000
1	Kenya_Discount Stickers_Holographic Goose	100.000000
2	Kenya_Stickers for Less_Holographic Goose	53.109112
3	Canada_Stickers for Less_Holographic Goose	51.153696
4	Kenya_Premium Sticker Mart_Holographic Goose	25.263981
5	Canada_Premium Sticker Mart_Holographic Goose	14.861165
6	Kenya_Discount Stickers_Kerneler	2.463825
7	Canada_Discount Stickers_Kerneler	0.039108
8	Kenya_Discount Stickers_Kerneler Dark Mode	0.039108

Series containing only missing values are excluded from the training set, as they provide no useful information. For the remaining series, missing values are not imputed because the chosen regressor, LightGBM, is capable of handling NaN values directly.

# Drop time series with 100% of missing values
# ==============================================================================
series_to_drop = ['Canada_Discount Stickers_Holographic Goose', 'Kenya_Discount Stickers_Holographic Goose']
data_train = data_train.query("unique_id not in @series_to_drop").copy()

✎ Note

As described in the competition discussion, better results can be obtained by filling missing values with a random number between 1 and the minimum value within the country (Kenya = 5, Canada = 200). However, this approach seems to be based on repeated submissions to the competition rather than on a solid statistical foundation. The approach taken in this notebook is to leave the missing values as they are and let the model learn from them. This is a common practice in time series forecasting, as it allows the model to learn from the patterns in the data without introducing artificial values. Nevertheless, the reader can try it with the following code:

# To try this, run the lines below instead of dropping the all-missing series above.
# Note: np.random.randint excludes the upper bound (values drawn are 1..199 and 1..4).

# Fill series "Canada_Discount Stickers_Holographic Goose" with random values between 1 and 200
# ==============================================================================
# mask = data_train['unique_id'] == 'Canada_Discount Stickers_Holographic Goose'
# data_train.loc[mask, 'num_sold'] = np.random.randint(1, 200, size=mask.sum())

# Fill series "Kenya_Discount Stickers_Holographic Goose" with random values between 1 and 5
# ==============================================================================
# mask = data_train['unique_id'] == 'Kenya_Discount Stickers_Holographic Goose'
# data_train.loc[mask, 'num_sold'] = np.random.randint(1, 5, size=mask.sum())

Feature engineering

Logarithmic transformation

Series are transformed using the logarithm function. This transformation is particularly useful for series with a long tail, as it reduces the impact of extreme values and helps to normalize the distribution of the data. Furthermore, it avoids negative predictions, which are a common issue when using machine learning models.

The logarithmic transformation is applied to the target variable num_sold and the resulting values are stored in a new column called log_num_sold. Once the predictions are made, the inverse transformation is applied to obtain the original scale of the data.

# Transform the target variable 'num_sold' to log scale
# ==============================================================================
data_train['log_num_sold'] = np.log1p(data_train['num_sold'])
data_train.head()

	unique_id	date	id	country	store	product	num_sold	log_num_sold
2557	Canada_Discount Stickers_Kaggle	2010-01-01	1	Canada	Discount Stickers	Kaggle	973.0	6.881411
2558	Canada_Discount Stickers_Kaggle	2010-01-02	91	Canada	Discount Stickers	Kaggle	881.0	6.782192
2559	Canada_Discount Stickers_Kaggle	2010-01-03	181	Canada	Discount Stickers	Kaggle	1003.0	6.911747
2560	Canada_Discount Stickers_Kaggle	2010-01-04	271	Canada	Discount Stickers	Kaggle	744.0	6.613384
2561	Canada_Discount Stickers_Kaggle	2010-01-05	361	Canada	Discount Stickers	Kaggle	707.0	6.562444
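The round trip between the original and the log scale can be sketched with the standard library alone. The sample values below are illustrative, except 973, which reproduces the first row of the table above; `np.log1p` and `np.expm1` behave the same way on arrays.

```python
import math

sales = [0.0, 5.0, 973.0]  # illustrative values, including a zero

# log1p(x) = log(1 + x) is defined at x = 0, unlike log(x)
log_sales = [math.log1p(x) for x in sales]

# expm1(y) = exp(y) - 1 is the inverse of log1p
recovered = [math.expm1(y) for y in log_sales]

# Any real-valued prediction on the log scale maps back to a value above -1,
# so back-transformed predictions are bounded below by -1.
low_prediction = math.expm1(-3.0)

print(log_sales, recovered, low_prediction)
```

Strictly speaking, a prediction below 0 on the log scale still maps to a value between -1 and 0, so clipping at zero after the inverse transformation may be worth considering.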

Exogenous variables

In addition to historical sales data, incorporating additional variables—also known as exogenous variables—can enhance the model's performance. For example, calendar-related variables such as the month, day of the week, and holidays can provide valuable context.

# Generate holidays for each country and year
# ==============================================================================
countries = ['Canada', 'Finland', 'Italy', 'Kenya', 'Norway', 'Singapore']
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
all_holidays = []

for country_code, year in product(countries, years):
    try:
        country_holidays = holidays.country_holidays(country_code, years=year)
        all_holidays.extend([
            {
                'country': country_code,
                'date': pd.to_datetime(date),
                'holiday_name': name
            }
            for date, name in country_holidays.items()
        ])
    except NotImplementedError:
        print(f"Country '{country_code}' is not supported by the holidays library.")

df_holidays = (
    pd.DataFrame(all_holidays)
    .groupby(['country', 'date'], as_index=False)
    .agg({'holiday_name': ', '.join})
)
df_holidays['is_holiday'] = 1
df_holidays.sort_values(by=['country', 'date'], inplace=True)

To account for the effect of holidays on sales in the days before and after them, it is recommended to include shifted versions of the holiday variable. This involves creating new indicators that show whether a holiday occurred a given number of days earlier (lag) or will occur a given number of days later (next).

Since the df_holidays DataFrame only lists holiday dates, it must first be expanded to include all calendar dates for each country before calculating these shifted variables.

# Build complete date range per country
date_range = pd.date_range(start=f"{min(years)}-01-01", end=f"{max(years)}-12-31")
all_dates = pd.MultiIndex.from_product([countries, date_range], names=['country', 'date']).to_frame(index=False)

# Merge with holidays
df_calendar = all_dates.merge(df_holidays, on=['country', 'date'], how='left')
df_calendar['is_holiday'] = df_calendar['is_holiday'].fillna(0).astype(int)

# Create lagged and next features for holidays
for i in [1, 2, 5, 7, 9]:
    df_calendar[f'is_holiday_lag_{i}'] = df_calendar.groupby('country')['is_holiday'].shift(i).fillna(0).astype(int)
    df_calendar[f'is_holiday_next_{i}'] = df_calendar.groupby('country')['is_holiday'].shift(-i).fillna(0).astype(int)
    
df_calendar.head()

	country	date	holiday_name	is_holiday	is_holiday_lag_1	is_holiday_lag_2
0	Canada	2010-01-01	New Year's Day	1	0	0
1	Canada	2010-01-02	NaN	0	1	0
2	Canada	2010-01-03	NaN	0	0	1
3	Canada	2010-01-04	NaN	0	0	0
4	Canada	2010-01-05	NaN	0	0	0
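To make the direction of each shift explicit, here is a small sketch on a made-up calendar with two hypothetical countries, "A" and "B". Grouping by country keeps the last days of one country from leaking into the first days of the next.

```python
import pandas as pd

# Toy calendar: two countries, four days each, one holiday per country
calendar = pd.DataFrame({
    "country": ["A"] * 4 + ["B"] * 4,
    "date": list(pd.date_range("2010-01-01", periods=4)) * 2,
    "is_holiday": [0, 0, 0, 1, 0, 1, 0, 0],
})

grouped = calendar.groupby("country")["is_holiday"]
# shift(1): flags the day AFTER a holiday (yesterday was a holiday)
calendar["is_holiday_lag_1"] = grouped.shift(1).fillna(0).astype(int)
# shift(-1): flags the day BEFORE a holiday (tomorrow is a holiday)
calendar["is_holiday_next_1"] = grouped.shift(-1).fillna(0).astype(int)
print(calendar)
```

Without the `groupby`, the first day of country "B" would inherit the holiday from the last day of country "A".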

# Add calendar features and encode them with cyclical encoding
# ==============================================================================
features_to_extract = [
    'month',
    'week',
    'day_of_week',
]
calendar_transformer = DatetimeFeatures(
                           variables           = 'date',
                           features_to_extract = features_to_extract,
                           drop_original       = False,
                       )
df_calendar = calendar_transformer.fit_transform(df_calendar)
df_calendar.columns = df_calendar.columns.str.replace('date_', '', regex=False)

features_to_encode = [
    "month",
    "week",
    "day_of_week",
]
max_values = {
    "month": 12,
    "week": 52,
    "day_of_week": 6,
}
cyclical_encoder = CyclicalFeatures(
                       variables     = features_to_encode,
                       max_values    = max_values,
                       drop_original = True
                   )
df_calendar = cyclical_encoder.fit_transform(df_calendar)
df_calendar.head()

	country	date	holiday_name	is_holiday	is_holiday_lag_1	is_holiday_lag_2	month_sin	month_cos	week_sin	week_cos	day_of_week_sin	day_of_week_cos
0	Canada	2010-01-01	New Year's Day	1	0	0	0.5	0.866025	0.120537	0.992709	-8.660254e-01	-0.5
1	Canada	2010-01-02	NaN	0	1	0	0.5	0.866025	0.120537	0.992709	-8.660254e-01	0.5
2	Canada	2010-01-03	NaN	0	0	1	0.5	0.866025	0.120537	0.992709	-2.449294e-16	1.0
3	Canada	2010-01-04	NaN	0	0	0	0.5	0.866025	0.120537	0.992709	0.000000e+00	1.0
4	Canada	2010-01-05	NaN	0	0	0	0.5	0.866025	0.120537	0.992709	8.660254e-01	0.5
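The encoded values can be checked by hand. The sketch below assumes the encoder computes the sine and cosine of 2π·value/max_value, which is consistent with the numbers in the table above; the helper `cyclical_encode` is hypothetical and is not part of feature_engine.

```python
import math


def cyclical_encode(value, max_value):
    """Return (sin, cos) of the angle 2*pi*value/max_value."""
    angle = 2 * math.pi * value / max_value
    return math.sin(angle), math.cos(angle)


# January (month=1, max_value=12) reproduces month_sin=0.5, month_cos=0.866025
month_sin, month_cos = cyclical_encode(1, 12)

# With day_of_week numbered 0..6 and max_value=6, Monday (0) and Sunday (6)
# land on the same point of the circle and become indistinguishable.
monday = cyclical_encode(0, 6)
sunday = cyclical_encode(6, 6)

# Using the number of distinct values (7) as the period keeps all days apart.
sunday_period_7 = cyclical_encode(6, 7)
print(month_sin, month_cos, monday, sunday, sunday_period_7)
```

This matches rows 2 and 3 of the table, where Sunday 2010-01-03 and Monday 2010-01-04 share practically the same day_of_week_sin and day_of_week_cos. Whether that overlap matters for this model is an open question; setting the maximum to 7 would be one way to avoid it.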

# Add all exogenous features to the training set
# ==============================================================================
exog_features = [
    'is_holiday',
    'is_holiday_lag_1',
    'is_holiday_lag_2',
    'is_holiday_lag_5',
    'is_holiday_lag_7',
    'is_holiday_lag_9',
    'is_holiday_next_1',
    'is_holiday_next_2',
    'is_holiday_next_5',
    'month_sin',
    'month_cos',
    'week_sin',
    'week_cos',
    'day_of_week_sin',
    'day_of_week_cos',
    'country',
    'store',
    'product',
]

data_train = data_train.merge(
    right    = df_calendar.drop(columns=['holiday_name']),
    how      = 'left',
    left_on  = ['country', 'date'],
    right_on = ['country', 'date'],
    validate = 'many_to_one'
)
data_train.head()

	unique_id	date	id	country	store	product	num_sold	log_num_sold	is_holiday	is_holiday_lag_1	...	month_sin	month_cos	week_sin	week_cos	day_of_week_sin	day_of_week_cos
0	Canada_Discount Stickers_Kaggle	2010-01-01	1	Canada	Discount Stickers	Kaggle	973.0	6.881411	1	0	...	0.5	0.866025	0.120537	0.992709	-8.660254e-01	-0.5
1	Canada_Discount Stickers_Kaggle	2010-01-02	91	Canada	Discount Stickers	Kaggle	881.0	6.782192	0	1	...	0.5	0.866025	0.120537	0.992709	-8.660254e-01	0.5
2	Canada_Discount Stickers_Kaggle	2010-01-03	181	Canada	Discount Stickers	Kaggle	1003.0	6.911747	0	0	...	0.5	0.866025	0.120537	0.992709	-2.449294e-16	1.0
3	Canada_Discount Stickers_Kaggle	2010-01-04	271	Canada	Discount Stickers	Kaggle	744.0	6.613384	0	0	...	0.5	0.866025	0.120537	0.992709	0.000000e+00	1.0
4	Canada_Discount Stickers_Kaggle	2010-01-05	361	Canada	Discount Stickers	Kaggle	707.0	6.562444	0	0	...	0.5	0.866025	0.120537	0.992709	8.660254e-01	0.5

5 rows × 25 columns

Modeling

Since there are 90 different time series, two different approaches can be taken: modeling each time series separately, known as local forecasting, or modeling all time series together, known as global forecasting.

In this case, a global forecasting approach is taken, which typically better fulfills the requirements of real-world applications, where a large number of time series exist and training a single model for each series is computationally infeasible. Furthermore, the global forecasting models implemented in skforecast are able to forecast new time series that were not present in the training data, which will be useful for predicting the "Holographic Goose" product.

Skforecast accepts different data structures when creating global forecasting models (ForecasterRecursiveMultiSeries):

If all series have the same length and share the same exogenous variables, the data can be passed as a pandas DataFrame where each column represents a time series and each row corresponds to a time step. The DataFrame index should be a datetime index.
If the series have different lengths, the data must be passed as a dictionary. The keys of the dictionary represent the names of the series, and the values are the series themselves. To facilitate this, the series_long_to_dict function can be used—it takes a DataFrame in "long format" and returns a dictionary of pandas Series. Similarly, if the exogenous variables differ (in values or type) across series, the data must also be provided as a dictionary. In this case, the exog_long_to_dict function is used, converting a "long format" DataFrame into a dictionary of exogenous variables (either pandas Series or pandas DataFrames).

In this scenario, the holiday-related exogenous variables differ for each series, as they are specific to the country where the product is sold. Therefore, the data must be passed as a dictionary.
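To illustrate the target structure, the following plain-pandas sketch builds a dictionary of series from a made-up long-format DataFrame. It only shows the shape of the result and is not the implementation of series_long_to_dict; the name `toy_series_dict` and the values are illustrative.

```python
import pandas as pd

# Toy long-format data: two series of different length (illustrative values)
long_df = pd.DataFrame({
    "unique_id": ["s1", "s1", "s1", "s2", "s2"],
    "date": pd.to_datetime(
        ["2010-01-01", "2010-01-02", "2010-01-03", "2010-01-02", "2010-01-03"]
    ),
    "log_num_sold": [6.88, 6.78, 6.91, 5.10, 5.25],
})

# One pandas Series per identifier, indexed by date with daily frequency
toy_series_dict = {
    name: group.set_index("date")["log_num_sold"].asfreq("D")
    for name, group in long_df.groupby("unique_id")
}
print({name: len(s) for name, s in toy_series_dict.items()})
```

Each value keeps its own date range, which is what allows series of different lengths to coexist in a single model.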

# Transform series and exog to dictionaries
# ==============================================================================
series_dict = series_long_to_dict(
    data      = data_train,
    series_id = 'unique_id',
    index     = 'date',
    values    = 'log_num_sold',
    freq      = 'D'
)

exog_dict = exog_long_to_dict(
    data      = data_train[exog_features + ['date', 'unique_id']],
    series_id = 'unique_id',
    index     = 'date',
    freq      = 'D'
)

When training a forecaster using exogenous variables, it is necessary to provide the exogenous variables for the prediction period. These variables must follow the same structure observed during training. Therefore, the exogenous variables for the test set must also be provided as a dictionary.

# Prepare exogenous variables for the test set
# ==============================================================================
data_test = data_test.merge(
    df_calendar.drop(columns=['holiday_name']),
    how      = 'left',
    left_on  = ['country', 'date'],
    right_on = ['country', 'date'],
    validate = 'many_to_one'
)

exog_dict_pred = exog_long_to_dict(
    data      = data_test[exog_features + ['date', 'unique_id']],
    series_id = 'unique_id',
    index     = 'date',
    freq      = 'D'
)

Feature encoding

The exogenous variables country, store, and product are categorical. Depending on the regressor used, it may be necessary to encode them. In this case, the LightGBM regressor can handle categorical variables directly. However, to ensure they are treated consistently in the training and prediction phases, the variables are first encoded as integers and then stored as pandas category type. For more details on how to encode exogenous variables, please refer to the Feature Engineering section of the user guide.

# Categorical encoding
# ==============================================================================
# A ColumnTransformer is used to transform categorical (not numerical) features
# using ordinal encoding. Numeric features are left untouched. Missing values
# are coded as -1. If a new category is found in the test set, it is encoded
# as -1.
categorical_features = ['country', 'store', 'product']
transformer_exog = make_column_transformer(
                       (
                           OrdinalEncoder(
                               dtype=int,
                               handle_unknown="use_encoded_value",
                               unknown_value=-1,
                               encoded_missing_value=-1
                           ),
                           categorical_features
                       ),
                       remainder="passthrough",
                       verbose_feature_names_out=False,
                   ).set_output(transform="pandas")

The encoder will be passed to the forecaster, so it can be used during the prediction phase.

Forecaster training

# Create forecaster
# ==============================================================================
forecaster = ForecasterRecursiveMultiSeries(
                 regressor        = LGBMRegressor(random_state=8520, verbose=-1),
                 lags             = 31,
                 encoding         = "ordinal_category",
                 transformer_exog = transformer_exog,
                 fit_kwargs       = {'categorical_feature': categorical_features}
             )
forecaster

ForecasterRecursiveMultiSeries

General Information

Regressor: LGBMRegressor
Lags: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31]
Window features: None
Window size: 31
Series encoding: ordinal_category
Exogenous included: False
Weight function included: False
Series weights: None
Differentiation order: None
Creation date: 2025-05-19 14:35:45
Last fit date: None
Skforecast version: 0.16.0
Python version: 3.12.9
Forecaster id: None

Exogenous Variables

None

Data Transformations

Transformer for series: None
Transformer for exog: ColumnTransformer(remainder='passthrough', transformers=[('ordinalencoder', OrdinalEncoder(dtype=<class 'int'>, encoded_missing_value=-1, handle_unknown='use_encoded_value', unknown_value=-1), ['country', 'store', 'product'])], verbose_feature_names_out=False)

Training Information

Series names (levels): None
Training range: None
Training index type: Not fitted
Training index frequency: Not fitted

Regressor Parameters

{'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 8520, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}

Fit Kwargs

{'categorical_feature': ['country', 'store', 'product']}


To find the best model hyperparameters, a Bayesian search is performed using the bayesian_search_forecaster_multiseries function. This method is combined with the OneStepAheadFold validation strategy and uses mean absolute percentage error (MAPE) as the evaluation metric. For more details on the validation strategy, see the Model Evaluation and Tuning section of the documentation.

Since hyperparameter searches should not be performed on the test set, the training data is split into two parts: a training set and a validation set. The training set is used to train the model, and the validation set is used to evaluate its performance.

# Bayesian search with OneStepAheadFold
# ==============================================================================
end_train = '2015-12-31 00:00:00'
start_validation = '2016-01-01 00:00:00'
initial_train_size = (pd.to_datetime(end_train) - pd.to_datetime(data_train['date'].min())).days

def search_space(trial):
    search_space  = {
        'lags'            : trial.suggest_categorical('lags', [1, 14, 21, 60]),
        'n_estimators'    : trial.suggest_int('n_estimators', 200, 800, step=100),
        'max_depth'       : trial.suggest_int('max_depth', 3, 8, step=1),
        'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 25, 500),
        'learning_rate'   : trial.suggest_float('learning_rate', 0.01, 0.5),
        'feature_fraction': trial.suggest_float('feature_fraction', 0.5, 0.8, step=0.1),
        'max_bin'         : trial.suggest_int('max_bin', 50, 100, step=25),
        'reg_alpha'       : trial.suggest_float('reg_alpha', 0, 1, step=0.1),
        'reg_lambda'      : trial.suggest_float('reg_lambda', 0, 1, step=0.1),
        'linear_tree'     : trial.suggest_categorical('linear_tree', [True, False]),
    }

    return search_space

cv = OneStepAheadFold(initial_train_size=initial_train_size)

results_search, best_trial = bayesian_search_forecaster_multiseries(
    forecaster        = forecaster,
    series            = series_dict,
    exog              = exog_dict,
    cv                = cv,
    search_space      = search_space,
    n_trials          = 20,
    metric            = "mean_absolute_percentage_error",
    suppress_warnings = True
)

best_params = results_search.at[0, 'params']
best_lags = results_search.at[0, 'lags']
results_search.head(3)

`Forecaster` refitted using the best-found lags and parameters, and the whole data set: 
  Lags: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55 56 57 58 59 60] 
  Parameters: {'n_estimators': 800, 'max_depth': 4, 'min_data_in_leaf': 190, 'learning_rate': 0.21343356904845662, 'feature_fraction': 0.7, 'max_bin': 75, 'reg_alpha': 0.1, 'reg_lambda': 1.0, 'linear_tree': True}
  Backtesting metric: 0.008537774021838387
  Levels: ['Canada_Discount Stickers_Kaggle', 'Canada_Discount Stickers_Kaggle Tiers', 'Canada_Discount Stickers_Kerneler', 'Canada_Discount Stickers_Kerneler Dark Mode', 'Canada_Premium Sticker Mart_Holographic Goose', 'Canada_Premium Sticker Mart_Kaggle', 'Canada_Premium Sticker Mart_Kaggle Tiers', 'Canada_Premium Sticker Mart_Kerneler', 'Canada_Premium Sticker Mart_Kerneler Dark Mode', 'Canada_Stickers for Less_Holographic Goose', '...', 'Singapore_Premium Sticker Mart_Holographic Goose', 'Singapore_Premium Sticker Mart_Kaggle', 'Singapore_Premium Sticker Mart_Kaggle Tiers', 'Singapore_Premium Sticker Mart_Kerneler', 'Singapore_Premium Sticker Mart_Kerneler Dark Mode', 'Singapore_Stickers for Less_Holographic Goose', 'Singapore_Stickers for Less_Kaggle', 'Singapore_Stickers for Less_Kaggle Tiers', 'Singapore_Stickers for Less_Kerneler', 'Singapore_Stickers for Less_Kerneler Dark Mode']

	levels	lags	params	mean_absolute_percentage_error__weighted_average	mean_absolute_percentage_error__average	mean_absolute_percentage_error__pooling	n_estimators	max_depth	min_data_in_leaf	learning_rate	feature_fraction	max_bin	reg_alpha	reg_lambda	linear_tree
0	[Canada_Discount Stickers_Kaggle, Canada_Disco...	[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14...	{'n_estimators': 800, 'max_depth': 4, 'min_dat...	0.008538	0.008490	0.008490	800	4	190	0.213434	0.7	75	0.1	1.0	True
1	[Canada_Discount Stickers_Kaggle, Canada_Disco...	[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14...	{'n_estimators': 400, 'max_depth': 4, 'min_dat...	0.008619	0.008503	0.008503	400	4	158	0.186406	0.6	100	0.0	1.0	True
2	[Canada_Discount Stickers_Kaggle, Canada_Disco...	[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14...	{'n_estimators': 800, 'max_depth': 8, 'min_dat...	0.008635	0.008543	0.008543	800	8	127	0.116468	0.7	100	1.0	0.0	True
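Because the series passed to the search contain log_num_sold, the MAPE values reported above should be read as errors on the log scale, which are not directly comparable with a MAPE computed on num_sold. A small sketch with made-up numbers shows how far apart the two can be:

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative sales and predictions that are each 10 % too high
sales_true = [500.0, 1000.0, 2000.0]
sales_pred = [550.0, 1100.0, 2200.0]

log_true = [math.log1p(v) for v in sales_true]
log_pred = [math.log1p(v) for v in sales_pred]

# Error as seen by a model trained and scored on log-transformed values
mape_log_scale = mape(log_true, log_pred)
# Error on the original sales scale
mape_original_scale = mape(sales_true, sales_pred)
print(f"log scale: {mape_log_scale:.4f}, original scale: {mape_original_scale:.4f}")
```

To estimate the error on the original scale, the predictions need to be back-transformed with the inverse of the logarithmic transformation before the metric is computed.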

Prediction

Once the model is trained, it can be used to make predictions. Note that when training a forecaster using exogenous variables, the exogenous variables must be provided for the prediction period using the exog parameter of the predict method.

Before generating predictions for the final test set, whose true sales are not available, it is important to assess the model's performance on a validation set. This intermediate evaluation shows how well the model generalizes and whether it is ready to be used on the test period.

To conduct this assessment, the model is trained using all available data up to "2015-12-31 00:00:00" and predictions are generated for the following year (2016). These predictions are then compared to the actual sales data to evaluate performance.

# Train the forecaster
# ==============================================================================
forecaster.fit(
    series = {k: v.loc[:end_train] for k, v in series_dict.items()},
    exog   = {k: v.loc[:end_train] for k, v in exog_dict.items()},
    suppress_warnings = True,
)

# Predictions for the validation set
# ==============================================================================
steps = (data_train['date'].max() - pd.to_datetime(end_train)).days
print('Number of steps to predict:', steps)

# Select the exogenous variables for the validation dates
exog_dict_validation = {k: v.loc[start_validation:] for k, v in exog_dict.items()}
predictions_validation = forecaster.predict(steps=steps, exog=exog_dict_validation)
predictions_validation.head(4)

Number of steps to predict: 366

╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
│ `last_window` has missing values. Most of machine learning models do not allow       │
│ missing values. Prediction method may fail.                                          │
│                                                                                      │
│ Category : MissingValuesWarning                                                      │
│ Location :                                                                           │
│ /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore │
│ cast/utils/utils.py:989                                                              │
│ Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            │
╰──────────────────────────────────────────────────────────────────────────────────────╯

	level	pred
2016-01-01	Canada_Discount Stickers_Kaggle	6.500966
2016-01-01	Canada_Discount Stickers_Kaggle Tiers	6.415851
2016-01-01	Canada_Discount Stickers_Kerneler	5.710596
2016-01-01	Canada_Discount Stickers_Kerneler Dark Mode	5.894394
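The 366 steps reported above follow from 2016 being a leap year. The short sketch below reproduces the calculation with pandas only; the value of `last_date` is an assumption taken from the training range shown elsewhere in this document (2016-12-31).

```python
import pandas as pd

end_train = "2015-12-31 00:00:00"
last_date = pd.Timestamp("2016-12-31")  # assumed last date of the training data

# Number of daily steps between the end of training and the last available date
steps = (last_date - pd.to_datetime(end_train)).days

# Dates covered by the validation horizon
horizon = pd.date_range(
    start=pd.to_datetime(end_train) + pd.Timedelta(days=1),
    periods=steps,
    freq="D",
)
print(steps, horizon[0].date(), horizon[-1].date())
```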

Since the model was trained on the logarithm of the target variable, the predictions are also on a logarithmic scale. To return to the original scale, the inverse transformation is applied with np.expm1, which computes exp(x) - 1 and is the inverse of np.log1p.

# Reverse the log transformation of the predictions
# ==============================================================================
predictions_validation['pred'] = np.expm1(predictions_validation['pred'])
predictions_validation.head(4)

	level	pred
2016-01-01	Canada_Discount Stickers_Kaggle	664.784257
2016-01-01	Canada_Discount Stickers_Kaggle Tiers	610.460865
2016-01-01	Canada_Discount Stickers_Kerneler	301.051077
2016-01-01	Canada_Discount Stickers_Kerneler Dark Mode	361.996929
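The round trip between the two scales can be sketched as follows. The example assumes that the target was transformed with np.log1p, as the use of np.expm1 suggests; the toy sales values are hypothetical, while 6.500966 is the first log-scale prediction shown above.

```python
import numpy as np

sales = np.array([0.0, 316.0, 706.0])  # toy values, including a zero
log_sales = np.log1p(sales)            # log(1 + x), defined at x = 0
recovered = np.expm1(log_sales)        # exp(y) - 1, inverse of log1p

print(np.allclose(recovered, sales))

# Log-scale prediction taken from the output above, mapped back to units sold
pred_original_scale = np.expm1(6.500966)
print(pred_original_scale)
```

Using log1p instead of a plain logarithm keeps the transformation defined when sales are zero.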

Next, the predictions are compared against the actual sales data to evaluate the model's performance.

# Compare predictions with the real values
# ==============================================================================
predictions_validation = predictions_validation.reset_index(names= 'date')
predictions_validation = predictions_validation.merge(
    data_train[['unique_id', 'date', 'num_sold']],
    left_on  = ['level', 'date'],
    right_on = ['unique_id', 'date'],
    how      = 'left',
    validate = '1:1'
)
predictions_validation = predictions_validation[['date', 'unique_id', 'pred', 'num_sold']]
predictions_validation.head(4)

	date	unique_id	pred	num_sold
0	2016-01-01	Canada_Discount Stickers_Kaggle	664.784257	706.0
1	2016-01-01	Canada_Discount Stickers_Kaggle Tiers	610.460865	634.0
2	2016-01-01	Canada_Discount Stickers_Kerneler	301.051077	316.0
3	2016-01-01	Canada_Discount Stickers_Kerneler Dark Mode	361.996929	404.0

# Calculate MAPE in the validation set
# ==============================================================================
# MAPE is undefined when the true value (the denominator) is 0, so records with
# 0 or missing values in `num_sold` are excluded from the calculation.
mask_not_zero =  predictions_validation['num_sold'] != 0
mask_not_nan = predictions_validation['num_sold'].notna()
mask = mask_not_zero & mask_not_nan

mape_validation = mean_absolute_percentage_error(
    y_true = predictions_validation.loc[mask, 'num_sold'],
    y_pred = predictions_validation.loc[mask, 'pred'],
)
print('Overall MAPE in the validation set :', mape_validation)


# MAPE per time series
# ==============================================================================
mape_validation_per_series = (
    predictions_validation
    .query('num_sold != 0 and num_sold.notna()')
    .groupby('unique_id')
    .apply(lambda group: mean_absolute_percentage_error(
        y_true = group['num_sold'],
        y_pred = group['pred'],
    ), include_groups=False)
    .sort_values()
    .reset_index(name='mape')
)
mape_validation_per_series

Overall MAPE in the validation set : 0.0806041750456532

	unique_id	mape
0	Canada_Discount Stickers_Kaggle	0.043041
1	Norway_Discount Stickers_Kerneler	0.045692
2	Canada_Discount Stickers_Kerneler Dark Mode	0.046368
3	Singapore_Discount Stickers_Kaggle	0.047748
4	Finland_Premium Sticker Mart_Kerneler	0.048505
...	...	...
83	Kenya_Premium Sticker Mart_Kaggle Tiers	0.159288
84	Kenya_Premium Sticker Mart_Holographic Goose	0.166229
85	Norway_Discount Stickers_Holographic Goose	0.193311
86	Singapore_Discount Stickers_Holographic Goose	0.214695
87	Italy_Discount Stickers_Holographic Goose	0.257877

88 rows × 2 columns
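To make the metric explicit, the sketch below computes MAPE directly with NumPy, applying the same exclusion of zero and missing true values. The function `mape_ignoring_invalid` and the toy data are hypothetical; the example only illustrates why a series with no valid true values produces no row in a per-series table.

```python
import numpy as np
import pandas as pd


def mape_ignoring_invalid(y_true, y_pred):
    """Mean of |y_true - y_pred| / |y_true| over records with a valid, non-zero y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = (y_true != 0) & ~np.isnan(y_true)
    return np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask]))


# Toy data: series "c" has no valid true values
toy = pd.DataFrame({
    "unique_id": ["a", "a", "b", "b", "c"],
    "num_sold": [100.0, 0.0, 200.0, np.nan, np.nan],
    "pred": [90.0, 5.0, 220.0, 210.0, 50.0],
})

overall = mape_ignoring_invalid(toy["num_sold"], toy["pred"])

valid = toy[(toy["num_sold"] != 0) & toy["num_sold"].notna()]
per_series = {
    name: mape_ignoring_invalid(group["num_sold"], group["pred"])
    for name, group in valid.groupby("unique_id")
}
print(overall, per_series)
```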

The following plots show the predictions and the actual sales data for four of the series.

set_dark_theme()
series_to_plot = [
    'Italy_Stickers for Less_Kerneler Dark Mode',
    'Singapore_Stickers for Less_Holographic Goose',
    'Italy_Discount Stickers_Holographic Goose',
    'Kenya_Premium Sticker Mart_Holographic Goose'
]

for series in series_to_plot:
    fig, ax = plt.subplots(1, 1, figsize=(7, 3))
    predictions_validation.query('unique_id == @series').plot(
        x='date',
        y=['num_sold', 'pred'],
        ax=ax,
        title=series,
        linewidth=0.7,
    )
    plt.show()

Finally, the model is trained using all available data, and the predict method is used to generate predictions for all series over the next three years (1095 days). All predictions are made at once, starting immediately after the last date of the training data. The model is not updated with new data before making each prediction.
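The idea of predicting a long horizon in one pass, without new observations, can be illustrated with a simplified recursive loop in which each prediction becomes the most recent lag for the next step. This is only a conceptual sketch: `recursive_forecast` and the toy one-step model are hypothetical and do not reproduce the internal implementation of skforecast.

```python
import numpy as np


def recursive_forecast(predict_one_step, last_window, steps):
    """Predict `steps` values, feeding each prediction back as the newest lag."""
    window = list(last_window)
    predictions = []
    for _ in range(steps):
        y_hat = predict_one_step(np.array(window))
        predictions.append(y_hat)
        window = window[1:] + [y_hat]  # drop the oldest lag, append the prediction
    return np.array(predictions)


def toy_model(window):
    """Hypothetical one-step model: the mean of the values in the window."""
    return float(window.mean())


preds = recursive_forecast(toy_model, last_window=[1.0, 2.0, 3.0], steps=3)
print(preds)
```

Because later steps depend on earlier predictions rather than on observed values, errors can accumulate as the horizon grows.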

# Train the forecaster with all available data
# ==============================================================================
forecaster.fit(series = series_dict, exog = exog_dict)
forecaster

╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
│ NaNs detected in `y_train`. They have been dropped because the target variable       │
│ cannot have NaN values. Same rows have been dropped from `X_train` to maintain       │
│ alignment. This is caused by series with interspersed NaNs.                          │
│                                                                                      │
│ Category : MissingValuesWarning                                                      │
│ Location :                                                                           │
│ /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore │
│ cast/recursive/_forecaster_recursive_multiseries.py:1191                             │
│ Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            │
╰──────────────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
│ NaNs detected in `X_train`. Some regressors do not allow NaN values during training. │
│ If you want to drop them, set `forecaster.dropna_from_series = True`.                │
│                                                                                      │
│ Category : MissingValuesWarning                                                      │
│ Location :                                                                           │
│ /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore │
│ cast/recursive/_forecaster_recursive_multiseries.py:1213                             │
│ Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            │
╰──────────────────────────────────────────────────────────────────────────────────────╯

ForecasterRecursiveMultiSeries

General Information

Regressor: LGBMRegressor
Lags: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60]
Window features: None
Window size: 60
Series encoding: ordinal_category
Exogenous included: True
Weight function included: False
Series weights: None
Differentiation order: None
Creation date: 2025-05-19 14:35:45
Last fit date: 2025-05-19 14:38:33
Skforecast version: 0.16.0
Python version: 3.12.9
Forecaster id: None

Exogenous Variables

is_holiday, is_holiday_lag_1, is_holiday_lag_2, is_holiday_lag_5, is_holiday_lag_7, is_holiday_lag_9, is_holiday_next_1, is_holiday_next_2, is_holiday_next_5, month_sin, month_cos, week_sin, week_cos, day_of_week_sin, day_of_week_cos, country, store, product

Data Transformations

Transformer for series: None
Transformer for exog: ColumnTransformer(remainder='passthrough', transformers=[('ordinalencoder', OrdinalEncoder(dtype=, encoded_missing_value=-1, handle_unknown='use_encoded_value', unknown_value=-1), ['country', 'store', 'product'])], verbose_feature_names_out=False)

Training Information

Series names (levels): Canada_Discount Stickers_Kaggle, Canada_Discount Stickers_Kaggle Tiers, Canada_Discount Stickers_Kerneler, Canada_Discount Stickers_Kerneler Dark Mode, Canada_Premium Sticker Mart_Holographic Goose, Canada_Premium Sticker Mart_Kaggle, Canada_Premium Sticker Mart_Kaggle Tiers, Canada_Premium Sticker Mart_Kerneler, Canada_Premium Sticker Mart_Kerneler Dark Mode, Canada_Stickers for Less_Holographic Goose, Canada_Stickers for Less_Kaggle, Canada_Stickers for Less_Kaggle Tiers, Canada_Stickers for Less_Kerneler, Canada_Stickers for Less_Kerneler Dark Mode, Finland_Discount Stickers_Holographic Goose, Finland_Discount Stickers_Kaggle, Finland_Discount Stickers_Kaggle Tiers, Finland_Discount Stickers_Kerneler, Finland_Discount Stickers_Kerneler Dark Mode, Finland_Premium Sticker Mart_Holographic Goose, Finland_Premium Sticker Mart_Kaggle, Finland_Premium Sticker Mart_Kaggle Tiers, Finland_Premium Sticker Mart_Kerneler, Finland_Premium Sticker Mart_Kerneler Dark Mode, Finland_Stickers for Less_Holographic Goose, ..., Norway_Premium Sticker Mart_Holographic Goose, Norway_Premium Sticker Mart_Kaggle, Norway_Premium Sticker Mart_Kaggle Tiers, Norway_Premium Sticker Mart_Kerneler, Norway_Premium Sticker Mart_Kerneler Dark Mode, Norway_Stickers for Less_Holographic Goose, Norway_Stickers for Less_Kaggle, Norway_Stickers for Less_Kaggle Tiers, Norway_Stickers for Less_Kerneler, Norway_Stickers for Less_Kerneler Dark Mode, Singapore_Discount Stickers_Holographic Goose, Singapore_Discount Stickers_Kaggle, Singapore_Discount Stickers_Kaggle Tiers, Singapore_Discount Stickers_Kerneler, Singapore_Discount Stickers_Kerneler Dark Mode, Singapore_Premium Sticker Mart_Holographic Goose, Singapore_Premium Sticker Mart_Kaggle, Singapore_Premium Sticker Mart_Kaggle Tiers, Singapore_Premium Sticker Mart_Kerneler, Singapore_Premium Sticker Mart_Kerneler Dark Mode, Singapore_Stickers for Less_Holographic Goose, Singapore_Stickers for Less_Kaggle, Singapore_Stickers for Less_Kaggle Tiers, Singapore_Stickers for Less_Kerneler, Singapore_Stickers for Less_Kerneler Dark Mode
Training range: 'Canada_Discount Stickers_Kaggle': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kaggle Tiers': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kerneler': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kerneler Dark Mode': ['2010-01-01', '2016-12-31'], 'Canada_Premium Sticker Mart_Holographic Goose': ['2010-01-01', '2016-12-31'], ..., 'Singapore_Stickers for Less_Holographic Goose': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kaggle': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kaggle Tiers': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kerneler': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kerneler Dark Mode': ['2010-01-01', '2016-12-31']
Training index type: DatetimeIndex
Training index frequency: D

Regressor Parameters

{'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.21343356904845662, 'max_depth': 4, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 800, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 8520, 'reg_alpha': 0.1, 'reg_lambda': 1.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1, 'min_data_in_leaf': 190, 'feature_fraction': 0.7, 'max_bin': 75, 'linear_tree': True, 'device': 'cpu'}

Fit Kwargs

{'categorical_feature': ['country', 'store', 'product']}

# Feature importance (top 7)
# ==============================================================================
importance = forecaster.get_feature_importances()
importance.head(7)

	feature	importance
13	lag_14	581
6	lag_7	572
55	lag_56	428
75	week_sin	418
76	week_cos	369
0	lag_1	319
1	lag_2	312

# Prediction of test set
# ==============================================================================
steps = (data_test['date'].max() - data_test['date'].min()).days + 1
print('Number of steps to predict:', steps)
predictions = forecaster.predict(steps=steps, exog=exog_dict_pred, suppress_warnings=True)

# Reverse the log transformation of the predictions
# ==============================================================================
predictions['pred'] = np.expm1(predictions['pred'])
predictions.head(4)

Number of steps to predict: 1095

	level	pred
2017-01-01	Canada_Discount Stickers_Kaggle	938.299735
2017-01-01	Canada_Discount Stickers_Kaggle Tiers	715.476380
2017-01-01	Canada_Discount Stickers_Kerneler	418.295782
2017-01-01	Canada_Discount Stickers_Kerneler Dark Mode	501.718288

Two of the series were excluded from the training set because they contained only missing values. Skforecast can forecast series that were not seen during training, but their predictions are not included by default. To obtain them, a last_window for these series must be passed to the predict method.

# Predict series not seen during training
# ==============================================================================
last_window_unseen_series = pd.DataFrame(
    data    = np.nan,
    index   = pd.date_range(end='2016-12-31', periods=forecaster.window_size, freq='D'),
    columns = ['Canada_Discount Stickers_Holographic Goose', 'Kenya_Discount Stickers_Holographic Goose']
)
predictions_unseen_Series = forecaster.predict(
    steps             = steps,
    last_window       = last_window_unseen_series,
    exog              = exog_dict_pred,
    suppress_warnings = True
)
predictions_unseen_Series['pred'] = np.expm1(predictions_unseen_Series['pred'])
predictions_unseen_Series

	level	pred
2017-01-01	Canada_Discount Stickers_Holographic Goose	130.095322
2017-01-01	Kenya_Discount Stickers_Holographic Goose	13.561558
2017-01-02	Canada_Discount Stickers_Holographic Goose	131.262703
2017-01-02	Kenya_Discount Stickers_Holographic Goose	15.972330
2017-01-03	Canada_Discount Stickers_Holographic Goose	163.919315
...	...	...
2019-12-29	Kenya_Discount Stickers_Holographic Goose	102.435982
2019-12-30	Canada_Discount Stickers_Holographic Goose	344.756188
2019-12-30	Kenya_Discount Stickers_Holographic Goose	90.246888
2019-12-31	Canada_Discount Stickers_Holographic Goose	335.771299
2019-12-31	Kenya_Discount Stickers_Holographic Goose	91.815570

2190 rows × 2 columns

Submission results

predictions_all = pd.concat([predictions, predictions_unseen_Series])
submission = data_test.merge(
    predictions_all.reset_index(names=['date']),
    how      = 'left',
    left_on  = ['date', 'unique_id'],
    right_on = ['date', 'level'],
    validate = 'one_to_one'
)

submission = submission.loc[:, ['id', 'pred']]
submission = submission.rename(columns={'pred': 'num_sold'})
submission.to_csv('submission.csv', index=False)
submission

	id	num_sold
0	230130	130.095322
1	230131	938.299735
2	230132	715.476380
3	230133	418.295782
4	230134	501.718288
...	...	...
98545	328675	415.009558
98546	328676	3036.446657
98547	328677	2080.632355
98548	328678	1366.599418
98549	328679	1640.340384

98550 rows × 2 columns
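The 98550 rows correspond to 90 series multiplied by 1095 days. Before uploading, it can be useful to confirm that the merge kept every test row and left no prediction missing. The sketch below shows such a check on toy data; the DataFrames `test_toy` and `preds_toy` are hypothetical stand-ins for data_test and the concatenated predictions.

```python
import pandas as pd

dates = pd.to_datetime(["2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02"])

# Hypothetical stand-ins for the test set and the predictions
test_toy = pd.DataFrame({"id": [0, 1, 2, 3], "date": dates, "unique_id": ["a", "b", "a", "b"]})
preds_toy = pd.DataFrame({"date": dates, "level": ["a", "b", "a", "b"], "pred": [10.0, 20.0, 11.0, 21.0]})

merged = test_toy.merge(
    preds_toy,
    how="left",
    left_on=["date", "unique_id"],
    right_on=["date", "level"],
    validate="one_to_one",  # raises if a key is duplicated on either side
)

n_missing = int(merged["pred"].isna().sum())  # test rows without a prediction
same_length = len(merged) == len(test_toy)    # the merge must not add or drop rows
expected_rows = 90 * 1095                     # series x days in the real submission
print(n_missing, same_length, expected_rows)
```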

# Upload results to Kaggle
# ==============================================================================
# !pip install kaggle
# !kaggle competitions submit -c playground-series-s5e1 -f submission.csv -m "uploading submission"

Session information

import session_info
session_info.show(html=False)

-----
feature_engine      1.8.3
holidays            0.72
lightgbm            4.6.0
matplotlib          3.10.1
numpy               2.2.5
optuna              3.6.2
pandas              2.2.3
session_info        v1.0.1
skforecast          0.16.0
sklearn             1.6.1
-----
IPython             9.1.0
jupyter_client      8.6.3
jupyter_core        5.7.2
notebook            6.5.7
-----
Python 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27) [GCC 11.2.0]
Linux-6.11.0-25-generic-x86_64-with-glibc2.39
-----
Session information updated at 2025-05-19 14:38

Citation

How to cite this document

If you use this document or any part of it, please acknowledge the source, thank you!

A Step-by-Step Guide to Global Time Series Forecasting Using Kaggle Sticker Sales Data by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) at https://cienciadedatos.net/documentos/py66-forecasting-sticker-sales-kaggle.html

How to cite skforecast

If you use skforecast for a publication, we would appreciate it if you cited the published software.

Zenodo:

Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.16.0). Zenodo. https://doi.org/10.5281/zenodo.8382788

APA:

Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.16.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788

BibTeX:

@software{skforecast, author = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier}, title = {skforecast}, version = {0.16.0}, month = {05}, year = {2025}, license = {BSD-3-Clause}, url = {https://skforecast.org/}, doi = {10.5281/zenodo.8382788} }

This work by Joaquín Amat Rodrigo and Javier Escobar Ortiz is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.

Allowed:

Share: copy and redistribute the material in any medium or format.
Adapt: remix, transform, and build upon the material.

Under the following terms:

Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial: You may not use the material for commercial purposes.
ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

A Step-by-Step Guide to Global Time Series Forecasting Using Kaggle Sticker Sales Data

Joaquín Amat Rodrigo, Javier Escobar Ortiz

May, 2025