• Introduction
  • Libraries
  • Data
    • Range of available dates
    • Missing values in target variable
  • Feature engineering
    • Logarithmic transformation
    • Exogenous variables
  • Modeling
    • Feature encoding
    • Forecaster training
    • Prediction
    • Submission results
  • Session information
  • Citation




Introduction

In early 2025, Kaggle launched the Forecasting Sticker Sales competition as part of its Playground series (Season 5, Episode 1), providing a hands-on challenge for time series forecasting enthusiasts.

Contestants were tasked with forecasting daily sales of five Kaggle-branded products across six countries and three store types, resulting in 90 different time series. The forecasting period spanned from 2017 to 2019, with historical sales data from 2010 to 2016 available for training. Performance was evaluated using the Mean Absolute Percentage Error (MAPE), highlighting the importance of accurate predictions across a diverse and moderately granular dataset.

This document serves as an introductory example of how to use the skforecast Python library to build predictive models and generate predictions. It will guide the reader through the main steps of a typical forecasting project, including data preparation, model training, and evaluation. While the focus is on simplicity and clarity, achieving peak performance would require more advanced feature engineering and iterative model refinement.

Libraries

The libraries used in this notebook are:

# Data management
# ==============================================================================
import pandas as pd
import numpy as np
from itertools import product

# Plots
# ==============================================================================
from matplotlib import pyplot as plt
from skforecast.plot import set_dark_theme

# Modelling
# ==============================================================================
from skforecast.recursive import ForecasterRecursiveMultiSeries
from skforecast.model_selection import bayesian_search_forecaster_multiseries, OneStepAheadFold
from skforecast.preprocessing import series_long_to_dict, exog_long_to_dict
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OrdinalEncoder
from lightgbm import LGBMRegressor
import holidays
from feature_engine.datetime import DatetimeFeatures
from feature_engine.creation import CyclicalFeatures

# Warnings
# ==============================================================================
import warnings
from skforecast.exceptions import MissingValuesWarning
warnings.simplefilter('ignore', category=MissingValuesWarning)

Data

The datasets used in this notebook are from the Kaggle competition "Forecasting Sticker Sales" (Walter Reade and Elizabeth Park, 2025).

# Data
# ==============================================================================
data_train = pd.read_csv('train.csv')
data_test = pd.read_csv('test.csv')
display(data_train.head())
display(data_test.head())
id date country store product num_sold
0 0 2010-01-01 Canada Discount Stickers Holographic Goose NaN
1 1 2010-01-01 Canada Discount Stickers Kaggle 973.0
2 2 2010-01-01 Canada Discount Stickers Kaggle Tiers 906.0
3 3 2010-01-01 Canada Discount Stickers Kerneler 423.0
4 4 2010-01-01 Canada Discount Stickers Kerneler Dark Mode 491.0
id date country store product
0 230130 2017-01-01 Canada Discount Stickers Holographic Goose
1 230131 2017-01-01 Canada Discount Stickers Kaggle
2 230132 2017-01-01 Canada Discount Stickers Kaggle Tiers
3 230133 2017-01-01 Canada Discount Stickers Kerneler
4 230134 2017-01-01 Canada Discount Stickers Kerneler Dark Mode
# Convert 'date' column to datetime
# ==============================================================================
data_train['date'] = pd.to_datetime(data_train['date'])
data_test['date'] = pd.to_datetime(data_test['date'])
# Create a new column 'unique_id' that identifies each time series as the combination
# of columns 'country', 'store', and 'product'
# ==============================================================================
data_train['unique_id'] = (
    data_train['country'] + '_' +
    data_train['store'] + '_' +
    data_train['product']
)
data_test['unique_id'] = (
    data_test['country'] + '_' +
    data_test['store'] + '_' +
    data_test['product']
)

display(data_train.head())
display(data_test.head())
id date country store product num_sold unique_id
0 0 2010-01-01 Canada Discount Stickers Holographic Goose NaN Canada_Discount Stickers_Holographic Goose
1 1 2010-01-01 Canada Discount Stickers Kaggle 973.0 Canada_Discount Stickers_Kaggle
2 2 2010-01-01 Canada Discount Stickers Kaggle Tiers 906.0 Canada_Discount Stickers_Kaggle Tiers
3 3 2010-01-01 Canada Discount Stickers Kerneler 423.0 Canada_Discount Stickers_Kerneler
4 4 2010-01-01 Canada Discount Stickers Kerneler Dark Mode 491.0 Canada_Discount Stickers_Kerneler Dark Mode
id date country store product unique_id
0 230130 2017-01-01 Canada Discount Stickers Holographic Goose Canada_Discount Stickers_Holographic Goose
1 230131 2017-01-01 Canada Discount Stickers Kaggle Canada_Discount Stickers_Kaggle
2 230132 2017-01-01 Canada Discount Stickers Kaggle Tiers Canada_Discount Stickers_Kaggle Tiers
3 230133 2017-01-01 Canada Discount Stickers Kerneler Canada_Discount Stickers_Kerneler
4 230134 2017-01-01 Canada Discount Stickers Kerneler Dark Mode Canada_Discount Stickers_Kerneler Dark Mode

Range of available dates

# Unique countries, stores and products
# ==============================================================================
print('Number of unique time series:', data_train['unique_id'].nunique())
print('Unique countries :', data_train['country'].unique())
print('Unique stores    :', data_train['store'].unique())
print('Unique products  :', data_train['product'].unique())
Number of unique time series: 90
Unique countries : ['Canada' 'Finland' 'Italy' 'Kenya' 'Norway' 'Singapore']
Unique stores    : ['Discount Stickers' 'Stickers for Less' 'Premium Sticker Mart']
Unique products  : ['Holographic Goose' 'Kaggle' 'Kaggle Tiers' 'Kerneler'
 'Kerneler Dark Mode']
# Date range in the training and test sets
# ==============================================================================
print('Date range in the training set :', data_train['date'].min(), 'to', data_train['date'].max())
print('Date range in the test set     :', data_test['date'].min(), 'to', data_test['date'].max())
Date range in the training set: 2010-01-01 00:00:00 to 2016-12-31 00:00:00
Date range in the test set    : 2017-01-01 00:00:00 to 2019-12-31 00:00:00
# Date range available in the training set for each time series
# ==============================================================================
date_range = data_train.groupby('unique_id')['date'].agg(['min', 'max', 'count'])
date_range = date_range.rename(columns={'min': 'start_date', 'max': 'end_date'})
date_range
start_date end_date count
unique_id
Canada_Discount Stickers_Holographic Goose 2010-01-01 2016-12-31 2557
Canada_Discount Stickers_Kaggle 2010-01-01 2016-12-31 2557
Canada_Discount Stickers_Kaggle Tiers 2010-01-01 2016-12-31 2557
Canada_Discount Stickers_Kerneler 2010-01-01 2016-12-31 2557
Canada_Discount Stickers_Kerneler Dark Mode 2010-01-01 2016-12-31 2557
... ... ... ...
Singapore_Stickers for Less_Holographic Goose 2010-01-01 2016-12-31 2557
Singapore_Stickers for Less_Kaggle 2010-01-01 2016-12-31 2557
Singapore_Stickers for Less_Kaggle Tiers 2010-01-01 2016-12-31 2557
Singapore_Stickers for Less_Kerneler 2010-01-01 2016-12-31 2557
Singapore_Stickers for Less_Kerneler Dark Mode 2010-01-01 2016-12-31 2557

90 rows × 3 columns

# Plot 3 time series
# ==============================================================================
set_dark_theme()
series_to_plot = [
    'Italy_Stickers for Less_Kerneler Dark Mode',
    'Singapore_Stickers for Less_Holographic Goose',
    'Italy_Stickers for Less_Kaggle'
]

for series in series_to_plot:
    fig, ax = plt.subplots(1, 1, figsize=(7, 2.5))
    data_train.query('unique_id == @series').plot(
        x='date',
        y=['num_sold'],
        ax=ax,
        title=series,
        linewidth=0.3,
        legend=False,
    )
    plt.show()

Missing values in target variable

Only 9 series have missing values in the target variable num_sold, most of them for the "Holographic Goose" product. Since the challenge description gives no information about the missing values, a decision must be made about how to handle them:

  • Missing values mean the product was not sold on that day, so the sales are 0.

  • Missing values mean the sales of the product are unknown, so they could be 0 or any other value.

# Ensure all time series are complete without intermediate gaps
# ==============================================================================
data_train = (
    data_train
    .groupby('unique_id')
    .apply(lambda group: group.set_index('date').asfreq('D', fill_value=np.nan), include_groups=False)
    .reset_index()
)
data_train
unique_id date id country store product num_sold
0 Canada_Discount Stickers_Holographic Goose 2010-01-01 0 Canada Discount Stickers Holographic Goose NaN
1 Canada_Discount Stickers_Holographic Goose 2010-01-02 90 Canada Discount Stickers Holographic Goose NaN
2 Canada_Discount Stickers_Holographic Goose 2010-01-03 180 Canada Discount Stickers Holographic Goose NaN
3 Canada_Discount Stickers_Holographic Goose 2010-01-04 270 Canada Discount Stickers Holographic Goose NaN
4 Canada_Discount Stickers_Holographic Goose 2010-01-05 360 Canada Discount Stickers Holographic Goose NaN
... ... ... ... ... ... ... ...
230125 Singapore_Stickers for Less_Kerneler Dark Mode 2016-12-27 229764 Singapore Stickers for Less Kerneler Dark Mode 1016.0
230126 Singapore_Stickers for Less_Kerneler Dark Mode 2016-12-28 229854 Singapore Stickers for Less Kerneler Dark Mode 1062.0
230127 Singapore_Stickers for Less_Kerneler Dark Mode 2016-12-29 229944 Singapore Stickers for Less Kerneler Dark Mode 1178.0
230128 Singapore_Stickers for Less_Kerneler Dark Mode 2016-12-30 230034 Singapore Stickers for Less Kerneler Dark Mode 1357.0
230129 Singapore_Stickers for Less_Kerneler Dark Mode 2016-12-31 230124 Singapore Stickers for Less Kerneler Dark Mode 1312.0

230130 rows × 7 columns

# Percentage of missing values in each series
# ==============================================================================
missing_pct = (
    data_train
    .groupby('unique_id')
    .apply(lambda group: group['num_sold'].isna().mean() * 100, include_groups=False)
    .sort_values(ascending=False)
    .reset_index(name='missing_values_pct')
)
missing_pct.query('missing_values_pct > 0')
unique_id missing_values_pct
0 Canada_Discount Stickers_Holographic Goose 100.000000
1 Kenya_Discount Stickers_Holographic Goose 100.000000
2 Kenya_Stickers for Less_Holographic Goose 53.109112
3 Canada_Stickers for Less_Holographic Goose 51.153696
4 Kenya_Premium Sticker Mart_Holographic Goose 25.263981
5 Canada_Premium Sticker Mart_Holographic Goose 14.861165
6 Kenya_Discount Stickers_Kerneler 2.463825
7 Canada_Discount Stickers_Kerneler 0.039108
8 Kenya_Discount Stickers_Kerneler Dark Mode 0.039108

Series containing only missing values are excluded from the training set, as they provide no useful information. For the remaining series, missing values are not imputed because the chosen regressor, LightGBM, is capable of handling NaN values directly.
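
To illustrate this behavior (a minimal sketch on synthetic data, unrelated to the competition pipeline), LightGBM can be trained on a feature matrix that contains NaN values without any imputation:

# Minimal example: LightGBM handles NaN in the features without imputation
# (synthetic data, for illustration only)
# ==============================================================================
rng = np.random.default_rng(123)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)  # target built before injecting NaNs
X[rng.random(X.shape) < 0.1] = np.nan              # ~10% missing values in the features
model = LGBMRegressor(verbose=-1).fit(X, y)
model.predict(X[:3])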

# Drop time series with 100% of missing values
# ==============================================================================
series_to_drop = ['Canada_Discount Stickers_Holographic Goose', 'Kenya_Discount Stickers_Holographic Goose']
data_train = data_train.query("unique_id not in @series_to_drop").copy()

✎ Note

As described in the competition discussion, better results can be obtained by filling the missing values with a random number between 1 and the minimum value within the country (Kenya = 5, Canada = 200). However, this approach seems to be based on repeated submissions to the competition rather than on a solid statistical foundation. The approach taken in this notebook is to leave the missing values as they are and let the model learn from them. This is a common practice in time series forecasting, as it allows the model to learn the patterns in the data without introducing artificial values. Nevertheless, the reader can try it with the following code:
# Fill series "Canada_Discount Stickers_Holographic Goose" with random values between 1 and 200
# ==============================================================================
# mask = data_train['unique_id'] == 'Canada_Discount Stickers_Holographic Goose'
# data_train.loc[mask, 'num_sold'] = np.random.randint(1, 200, size=sum(mask))

# Fill series "Kenya_Discount Stickers_Holographic Goose" with random values between 1 and 5
# ==============================================================================
# mask = data_train['unique_id'] == 'Kenya_Discount Stickers_Holographic Goose'
# data_train.loc[mask, 'num_sold'] = np.random.randint(1, 5, size=sum(mask))

Feature engineering

Logarithmic transformation

Series are transformed using the logarithm function. This transformation is particularly useful for series with a long tail, as it reduces the impact of extreme values and helps normalize the distribution of the data. Furthermore, it avoids negative predictions, a common issue when using machine learning models.

The logarithmic transformation is applied to the target variable num_sold and the resulting values are stored in a new column called log_num_sold. Once the predictions are made, the inverse transformation is applied to obtain the original scale of the data.

# Transform the target variable 'num_sold' to log scale
# ==============================================================================
data_train['log_num_sold'] = np.log1p(data_train['num_sold'])
data_train.head()
unique_id date id country store product num_sold log_num_sold
2557 Canada_Discount Stickers_Kaggle 2010-01-01 1 Canada Discount Stickers Kaggle 973.0 6.881411
2558 Canada_Discount Stickers_Kaggle 2010-01-02 91 Canada Discount Stickers Kaggle 881.0 6.782192
2559 Canada_Discount Stickers_Kaggle 2010-01-03 181 Canada Discount Stickers Kaggle 1003.0 6.911747
2560 Canada_Discount Stickers_Kaggle 2010-01-04 271 Canada Discount Stickers Kaggle 744.0 6.613384
2561 Canada_Discount Stickers_Kaggle 2010-01-05 361 Canada Discount Stickers Kaggle 707.0 6.562444
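
As a quick sanity check (optional), np.expm1 is the exact inverse of np.log1p, so the original sales can be recovered from the transformed column:

# Check that np.expm1 reverses np.log1p (up to floating-point precision)
# ==============================================================================
recovered = np.expm1(data_train['log_num_sold'])
assert np.allclose(recovered.dropna(), data_train['num_sold'].dropna())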

Exogenous variables

In addition to historical sales data, incorporating additional variables—also known as exogenous variables—can enhance the model's performance. For example, calendar-related variables such as the month, day of the week, and holidays can provide valuable context.

# Generate holidays for each country and year
# ==============================================================================
countries = ['Canada', 'Finland', 'Italy', 'Kenya', 'Norway', 'Singapore']
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
all_holidays = []

for country, year in product(countries, years):
    try:
        country_holidays = holidays.country_holidays(country, years=year)
        all_holidays.extend([
            {
                'country': country,
                'date': pd.to_datetime(date),
                'holiday_name': name
            }
            for date, name in country_holidays.items()
        ])
    except NotImplementedError:
        print(f"Country '{country}' is not supported by the holidays library.")

df_holidays = (
    pd.DataFrame(all_holidays)
    .groupby(['country', 'date'], as_index=False)
    .agg({'holiday_name': ', '.join})
)
df_holidays['is_holiday'] = 1
df_holidays.sort_values(by=['country', 'date'], inplace=True)

To account for the delayed impact of holidays on sales, it's recommended to include lagged versions of the holiday variable. This involves creating new indicators that show whether the previous or following day is a holiday.

Since the df_holidays DataFrame only lists holiday dates, it must first be expanded to include all calendar dates for each country before calculating these shifted variables.

# Build complete date range per country
date_range = pd.date_range(start=f"{min(years)}-01-01", end=f"{max(years)}-12-31")
all_dates = pd.MultiIndex.from_product([countries, date_range], names=['country', 'date']).to_frame(index=False)

# Merge with holidays
df_calendar = all_dates.merge(df_holidays, on=['country', 'date'], how='left')
df_calendar['is_holiday'] = df_calendar['is_holiday'].fillna(0).astype(int)

# Create lagged and next features for holidays
for i in [1, 2, 5, 7, 9]:
    df_calendar[f'is_holiday_lag_{i}'] = df_calendar.groupby('country')['is_holiday'].shift(i).fillna(0).astype(int)
    df_calendar[f'is_holiday_next_{i}'] = df_calendar.groupby('country')['is_holiday'].shift(-i).fillna(0).astype(int)
    
df_calendar.head()
country date holiday_name is_holiday is_holiday_lag_1 is_holiday_next_1 is_holiday_lag_2 is_holiday_next_2 is_holiday_lag_5 is_holiday_next_5 is_holiday_lag_7 is_holiday_next_7 is_holiday_lag_9 is_holiday_next_9
0 Canada 2010-01-01 New Year's Day 1 0 0 0 0 0 0 0 0 0 0
1 Canada 2010-01-02 NaN 0 1 0 0 0 0 0 0 0 0 0
2 Canada 2010-01-03 NaN 0 0 0 1 0 0 0 0 0 0 0
3 Canada 2010-01-04 NaN 0 0 0 0 0 0 0 0 0 0 0
4 Canada 2010-01-05 NaN 0 0 0 0 0 0 0 0 0 0 0
# Add calendar features and encode them with cyclical encoding
# ==============================================================================
features_to_extract = [
    'month',
    'week',
    'day_of_week',
]
calendar_transformer = DatetimeFeatures(
                           variables           = 'date',
                           features_to_extract = features_to_extract,
                           drop_original       = False,
                       )
df_calendar = calendar_transformer.fit_transform(df_calendar)
df_calendar.columns = df_calendar.columns.str.replace('date_', '', regex=False)

features_to_encode = [
    "month",
    "week",
    "day_of_week",
]
max_values = {
    "month": 12,
    "week": 52,
    "day_of_week": 6,
}
cyclical_encoder = CyclicalFeatures(
                       variables     = features_to_encode,
                       max_values    = max_values,
                       drop_original = True
                   )
df_calendar = cyclical_encoder.fit_transform(df_calendar)
df_calendar.head()
country date holiday_name is_holiday is_holiday_lag_1 is_holiday_next_1 is_holiday_lag_2 is_holiday_next_2 is_holiday_lag_5 is_holiday_next_5 is_holiday_lag_7 is_holiday_next_7 is_holiday_lag_9 is_holiday_next_9 month_sin month_cos week_sin week_cos day_of_week_sin day_of_week_cos
0 Canada 2010-01-01 New Year's Day 1 0 0 0 0 0 0 0 0 0 0 0.5 0.866025 0.120537 0.992709 -8.660254e-01 -0.5
1 Canada 2010-01-02 NaN 0 1 0 0 0 0 0 0 0 0 0 0.5 0.866025 0.120537 0.992709 -8.660254e-01 0.5
2 Canada 2010-01-03 NaN 0 0 0 1 0 0 0 0 0 0 0 0.5 0.866025 0.120537 0.992709 -2.449294e-16 1.0
3 Canada 2010-01-04 NaN 0 0 0 0 0 0 0 0 0 0 0 0.5 0.866025 0.120537 0.992709 0.000000e+00 1.0
4 Canada 2010-01-05 NaN 0 0 0 0 0 0 0 0 0 0 0 0.5 0.866025 0.120537 0.992709 8.660254e-01 0.5
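The encoded values can be verified by hand: CyclicalFeatures computes sin(2πx/max_value) and cos(2πx/max_value). For example, 2010-01-01 was a Friday (day_of_week = 4, with Monday = 0), so day_of_week_sin = sin(2π · 4/6) ≈ -0.866 and day_of_week_cos = cos(2π · 4/6) = -0.5, matching the first row of the table above:

# Manual check of the cyclical encoding for 2010-01-01 (Friday, day_of_week = 4)
# ==============================================================================
print(np.sin(2 * np.pi * 4 / 6))  # ≈ -0.866
print(np.cos(2 * np.pi * 4 / 6))  # -0.5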
# Add all exogenous features to the training set
# ==============================================================================
exog_features = [
    'is_holiday',
    'is_holiday_lag_1',
    'is_holiday_lag_2',
    'is_holiday_lag_5',
    'is_holiday_lag_7',
    'is_holiday_lag_9',
    'is_holiday_next_1',
    'is_holiday_next_2',
    'is_holiday_next_5',
    'month_sin',
    'month_cos',
    'week_sin',
    'week_cos',
    'day_of_week_sin',
    'day_of_week_cos',
    'country',
    'store',
    'product',
]

data_train = data_train.merge(
    right    = df_calendar.drop(columns=['holiday_name']),
    how      = 'left',
    left_on  = ['country', 'date'],
    right_on = ['country', 'date'],
    validate = 'many_to_one'
)
data_train.head()
unique_id date id country store product num_sold log_num_sold is_holiday is_holiday_lag_1 ... is_holiday_lag_7 is_holiday_next_7 is_holiday_lag_9 is_holiday_next_9 month_sin month_cos week_sin week_cos day_of_week_sin day_of_week_cos
0 Canada_Discount Stickers_Kaggle 2010-01-01 1 Canada Discount Stickers Kaggle 973.0 6.881411 1 0 ... 0 0 0 0 0.5 0.866025 0.120537 0.992709 -8.660254e-01 -0.5
1 Canada_Discount Stickers_Kaggle 2010-01-02 91 Canada Discount Stickers Kaggle 881.0 6.782192 0 1 ... 0 0 0 0 0.5 0.866025 0.120537 0.992709 -8.660254e-01 0.5
2 Canada_Discount Stickers_Kaggle 2010-01-03 181 Canada Discount Stickers Kaggle 1003.0 6.911747 0 0 ... 0 0 0 0 0.5 0.866025 0.120537 0.992709 -2.449294e-16 1.0
3 Canada_Discount Stickers_Kaggle 2010-01-04 271 Canada Discount Stickers Kaggle 744.0 6.613384 0 0 ... 0 0 0 0 0.5 0.866025 0.120537 0.992709 0.000000e+00 1.0
4 Canada_Discount Stickers_Kaggle 2010-01-05 361 Canada Discount Stickers Kaggle 707.0 6.562444 0 0 ... 0 0 0 0 0.5 0.866025 0.120537 0.992709 8.660254e-01 0.5

5 rows × 25 columns

Modeling

Since there are 90 different time series, two different approaches can be taken: modeling each time series separately, known as local forecasting, or modeling all time series together, known as global forecasting.

In this case, a global forecasting approach is taken, which typically better fulfills the requirements of real-world applications, where a large number of time series exist and training a single model for each series is computationally infeasible. Furthermore, the global forecasting models implemented in skforecast are able to forecast new time series that were not present in the training data, which will be useful for predicting the "Holographic Goose" product.
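
For illustration, the sketch below contrasts both strategies (a minimal sketch, not part of the submission pipeline; it assumes a dictionary series mapping each series name to a pandas Series with a datetime index, like the one built later in this notebook):

# Sketch: local vs. global forecasting
# ==============================================================================
from skforecast.recursive import ForecasterRecursive

# Local approach: one independent forecaster per series (90 models here)
local_forecasters = {}
for name, y in series.items():
    local_forecasters[name] = ForecasterRecursive(
        regressor = LGBMRegressor(verbose=-1),
        lags      = 31
    )
    local_forecasters[name].fit(y=y)

# Global approach: a single forecaster shared by all series
global_forecaster = ForecasterRecursiveMultiSeries(
    regressor = LGBMRegressor(verbose=-1),
    lags      = 31
)
global_forecaster.fit(series=series)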

Skforecast accepts different data structures when creating global forecasting models (ForecasterRecursiveMultiSeries):

  • If all series have the same length and share the same exogenous variables, the data can be passed as a pandas DataFrame where each column represents a time series and each row corresponds to a time step. The DataFrame index should be a datetime index.

  • If the series have different lengths, the data must be passed as a dictionary. The keys of the dictionary represent the names of the series, and the values are the series themselves. To facilitate this, the series_long_to_dict function can be used—it takes a DataFrame in "long format" and returns a dictionary of pandas Series. Similarly, if the exogenous variables differ (in values or type) across series, the data must also be provided as a dictionary. In this case, the exog_long_to_dict function is used, converting a "long format" DataFrame into a dictionary of exogenous variables (either pandas Series or pandas DataFrames).

In this scenario, the holiday-related exogenous variables differ for each series, as they are specific to the country where the product is sold. Therefore, the data must be passed as a dictionary.

# Transform series and exog to dictionaries
# ==============================================================================
series_dict = series_long_to_dict(
    data      = data_train,
    series_id = 'unique_id',
    index     = 'date',
    values    = 'log_num_sold',
    freq      = 'D'
)

exog_dict = exog_long_to_dict(
    data      = data_train[exog_features + ['date', 'unique_id']],
    series_id = 'unique_id',
    index     = 'date',
    freq      = 'D'
)
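
The resulting objects can be inspected to confirm their structure: each key is a unique_id, the values in series_dict are pandas Series, and the values in exog_dict are pandas DataFrames, all indexed by date.

# Inspect one entry of each dictionary
# ==============================================================================
series_name = 'Canada_Discount Stickers_Kaggle'
print(type(series_dict[series_name]))
print(series_dict[series_name].head(3))
print(exog_dict[series_name].head(3))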

When training a forecaster using exogenous variables, it is necessary to provide the exogenous variables for the prediction period. These variables must follow the same structure observed during training. Therefore, the exogenous variables for the test set must also be provided as a dictionary.

# Prepare exogenous variables for the test set
# ==============================================================================
data_test = data_test.merge(
    df_calendar.drop(columns=['holiday_name']),
    how      = 'left',
    left_on  = ['country', 'date'],
    right_on = ['country', 'date'],
    validate = 'many_to_one'
)

exog_dict_pred = exog_long_to_dict(
    data      = data_test[exog_features + ['date', 'unique_id']],
    series_id = 'unique_id',
    index     = 'date',
    freq      = 'D'
)

Feature encoding

The exogenous variables country, store, and product are categorical. Depending on the regressor used, it may be necessary to encode them. In this case, the LightGBM regressor can handle categorical variables directly. However, to ensure they are treated consistently in the training and prediction phases, the variables are first encoded as integers and then stored as pandas category type. For more details on how to encode exogenous variables, please refer to the Feature Engineering section of the user guide.

# Categorical encoding
# ==============================================================================
# A ColumnTransformer is used to transform categorical (not numerical) features
# using ordinal encoding. Numeric features are left untouched. Missing values
# are coded as -1. If a new category is found in the test set, it is encoded
# as -1.
categorical_features = ['country', 'store', 'product']
transformer_exog = make_column_transformer(
                       (
                           OrdinalEncoder(
                               dtype=int,
                               handle_unknown="use_encoded_value",
                               unknown_value=-1,
                               encoded_missing_value=-1
                           ),
                           categorical_features
                       ),
                       remainder="passthrough",
                       verbose_feature_names_out=False,
                   ).set_output(transform="pandas")

The encoder will be passed to the forecaster, so it can be used during the prediction phase.
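
To see the encoder in action before handing it to the forecaster, it can be applied to a small toy DataFrame (an illustration on made-up rows; a clone is used so the original transformer remains unfitted). Known categories are mapped to integers, and a category unseen during fit is encoded as -1:

# Illustration of the ordinal encoding on a toy DataFrame
# ==============================================================================
from sklearn.base import clone

toy = pd.DataFrame({
    'country'   : ['Canada', 'Finland'],
    'store'     : ['Discount Stickers', 'Premium Sticker Mart'],
    'product'   : ['Kaggle', 'Kerneler'],
    'is_holiday': [1, 0],
})
encoder = clone(transformer_exog)  # keep the original transformer unfitted
print(encoder.fit_transform(toy))

toy_new = toy.copy()
toy_new.loc[0, 'country'] = 'Spain'  # category not seen during fit -> encoded as -1
print(encoder.transform(toy_new))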

Forecaster training

# Create forecaster
# ==============================================================================
forecaster = ForecasterRecursiveMultiSeries(
                 regressor        = LGBMRegressor(random_state=8520, verbose=-1),
                 lags             = 31,
                 encoding         = "ordinal_category",
                 transformer_exog = transformer_exog,
                 fit_kwargs       = {'categorical_feature': categorical_features}
             )
forecaster

ForecasterRecursiveMultiSeries

General Information
  • Regressor: LGBMRegressor
  • Lags: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31]
  • Window features: None
  • Window size: 31
  • Series encoding: ordinal_category
  • Exogenous included: False
  • Weight function included: False
  • Series weights: None
  • Differentiation order: None
  • Creation date: 2025-05-19 14:35:45
  • Last fit date: None
  • Skforecast version: 0.16.0
  • Python version: 3.12.9
  • Forecaster id: None
Exogenous Variables
    None
Data Transformations
  • Transformer for series: None
  • Transformer for exog: ColumnTransformer(remainder='passthrough', transformers=[('ordinalencoder', OrdinalEncoder(dtype=<class 'int'>, encoded_missing_value=-1, handle_unknown='use_encoded_value', unknown_value=-1), ['country', 'store', 'product'])], verbose_feature_names_out=False)
Training Information
  • Series names (levels): None
  • Training range: None
  • Training index type: Not fitted
  • Training index frequency: Not fitted
Regressor Parameters
    {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 8520, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}
Fit Kwargs
    {'categorical_feature': ['country', 'store', 'product']}


To find the best model hyperparameters, a Bayesian search is performed using the bayesian_search_forecaster_multiseries function. This method is combined with the OneStepAheadFold validation strategy and uses mean absolute percentage error (MAPE) as the evaluation metric. For more details on the validation strategy, see the Model Evaluation and Tuning section of the documentation.

Since hyperparameter searches should not be performed on the test set, the training data is split into two parts: a training set and a validation set. The training set is used to train the model, and the validation set is used to evaluate its performance.

# Bayesian search with OneStepAheadFold
# ==============================================================================
end_train = '2015-12-31 00:00:00'
start_validation = '2016-01-01 00:00:00'
initial_train_size = (pd.to_datetime(end_train) - pd.to_datetime(data_train['date'].min())).days

def search_space(trial):
    search_space  = {
        'lags'            : trial.suggest_categorical('lags', [1, 14, 21, 60]),
        'n_estimators'    : trial.suggest_int('n_estimators', 200, 800, step=100),
        'max_depth'       : trial.suggest_int('max_depth', 3, 8, step=1),
        'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 25, 500),
        'learning_rate'   : trial.suggest_float('learning_rate', 0.01, 0.5),
        'feature_fraction': trial.suggest_float('feature_fraction', 0.5, 0.8, step=0.1),
        'max_bin'         : trial.suggest_int('max_bin', 50, 100, step=25),
        'reg_alpha'       : trial.suggest_float('reg_alpha', 0, 1, step=0.1),
        'reg_lambda'      : trial.suggest_float('reg_lambda', 0, 1, step=0.1),
        'linear_tree'     : trial.suggest_categorical('linear_tree', [True, False]),
    }

    return search_space

cv = OneStepAheadFold(initial_train_size=initial_train_size)

results_search, best_trial = bayesian_search_forecaster_multiseries(
    forecaster        = forecaster,
    series            = series_dict,
    exog              = exog_dict,
    cv                = cv,
    search_space      = search_space,
    n_trials          = 20,
    metric            = "mean_absolute_percentage_error",
    suppress_warnings = True
)

best_params = results_search.at[0, 'params']
best_lags = results_search.at[0, 'lags']
results_search.head(3)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set: 
  Lags: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55 56 57 58 59 60] 
  Parameters: {'n_estimators': 800, 'max_depth': 4, 'min_data_in_leaf': 190, 'learning_rate': 0.21343356904845662, 'feature_fraction': 0.7, 'max_bin': 75, 'reg_alpha': 0.1, 'reg_lambda': 1.0, 'linear_tree': True}
  Backtesting metric: 0.008537774021838387
  Levels: ['Canada_Discount Stickers_Kaggle', 'Canada_Discount Stickers_Kaggle Tiers', 'Canada_Discount Stickers_Kerneler', 'Canada_Discount Stickers_Kerneler Dark Mode', 'Canada_Premium Sticker Mart_Holographic Goose', 'Canada_Premium Sticker Mart_Kaggle', 'Canada_Premium Sticker Mart_Kaggle Tiers', 'Canada_Premium Sticker Mart_Kerneler', 'Canada_Premium Sticker Mart_Kerneler Dark Mode', 'Canada_Stickers for Less_Holographic Goose', '...', 'Singapore_Premium Sticker Mart_Holographic Goose', 'Singapore_Premium Sticker Mart_Kaggle', 'Singapore_Premium Sticker Mart_Kaggle Tiers', 'Singapore_Premium Sticker Mart_Kerneler', 'Singapore_Premium Sticker Mart_Kerneler Dark Mode', 'Singapore_Stickers for Less_Holographic Goose', 'Singapore_Stickers for Less_Kaggle', 'Singapore_Stickers for Less_Kaggle Tiers', 'Singapore_Stickers for Less_Kerneler', 'Singapore_Stickers for Less_Kerneler Dark Mode']

levels lags params mean_absolute_percentage_error__weighted_average mean_absolute_percentage_error__average mean_absolute_percentage_error__pooling n_estimators max_depth min_data_in_leaf learning_rate feature_fraction max_bin reg_alpha reg_lambda linear_tree
0 [Canada_Discount Stickers_Kaggle, Canada_Disco... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 800, 'max_depth': 4, 'min_dat... 0.008538 0.008490 0.008490 800 4 190 0.213434 0.7 75 0.1 1.0 True
1 [Canada_Discount Stickers_Kaggle, Canada_Disco... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 400, 'max_depth': 4, 'min_dat... 0.008619 0.008503 0.008503 400 4 158 0.186406 0.6 100 0.0 1.0 True
2 [Canada_Discount Stickers_Kaggle, Canada_Disco... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 800, 'max_depth': 8, 'min_dat... 0.008635 0.008543 0.008543 800 8 127 0.116468 0.7 100 1.0 0.0 True

Prediction

Once the model is trained, it can be used to make predictions. Note that when training a forecaster using exogenous variables, the exogenous variables must be provided for the prediction period using the exog parameter of the predict method.

Before proceeding to the final test set, it is important to first assess the model's performance on a validation set, since the true sales for the test period are not available. This intermediate evaluation provides insight into how well the model generalizes and whether it is suitable for final testing.

To conduct this assessment, the model is trained using all available data up to "2015-12-31 00:00:00", and predictions are generated for the following year. These predictions are then compared to the actual sales data to evaluate performance.

# Train the forecaster
# ==============================================================================
forecaster.fit(
    series = {k: v.loc[:end_train] for k, v in series_dict.items()},
    exog   = {k: v.loc[:end_train] for k, v in exog_dict.items()},
    suppress_warnings = True,
)
# Predictions for the validation set
# ==============================================================================
steps = (data_train['date'].max() - pd.to_datetime(end_train)).days
print('Number of steps to predict:', steps)

# Select the exogenous variables for the validation dates
exog_dict_validation = {k: v.loc[start_validation:] for k, v in exog_dict.items()}
predictions_validation = forecaster.predict(steps=steps, exog=exog_dict_validation)
predictions_validation.head(4)
Number of steps to predict: 366
╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
 `last_window` has missing values. Most of machine learning models do not allow       
 missing values. Prediction method may fail.                                          
                                                                                      
 Category : MissingValuesWarning                                                      
 Location :                                                                           
 /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore 
 cast/utils/utils.py:989                                                              
 Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            
╰──────────────────────────────────────────────────────────────────────────────────────╯
level pred
2016-01-01 Canada_Discount Stickers_Kaggle 6.500966
2016-01-01 Canada_Discount Stickers_Kaggle Tiers 6.415851
2016-01-01 Canada_Discount Stickers_Kerneler 5.710596
2016-01-01 Canada_Discount Stickers_Kerneler Dark Mode 5.894394

Since the training was done using the logarithm of the target variable, the predictions are also on the logarithmic scale. To recover the original scale of the data, the inverse transformation is applied with np.expm1, the inverse of np.log1p.

# Reverse the log transformation of the predictions
# ==============================================================================
predictions_validation['pred'] = np.expm1(predictions_validation['pred'])
predictions_validation.head(4)
level pred
2016-01-01 Canada_Discount Stickers_Kaggle 664.784257
2016-01-01 Canada_Discount Stickers_Kaggle Tiers 610.460865
2016-01-01 Canada_Discount Stickers_Kerneler 301.051077
2016-01-01 Canada_Discount Stickers_Kerneler Dark Mode 361.996929

Next, the predictions are compared against the actual sales data to evaluate the model's performance.

# Compare predictions with the real values
# ==============================================================================
predictions_validation = predictions_validation.reset_index(names='date')
predictions_validation = predictions_validation.merge(
    data_train[['unique_id', 'date', 'num_sold']],
    left_on  = ['level', 'date'],
    right_on = ['unique_id', 'date'],
    how      = 'left',
    validate = '1:1'
)
predictions_validation = predictions_validation[['date', 'unique_id', 'pred', 'num_sold']]
predictions_validation.head(4)
date unique_id pred num_sold
0 2016-01-01 Canada_Discount Stickers_Kaggle 664.784257 706.0
1 2016-01-01 Canada_Discount Stickers_Kaggle Tiers 610.460865 634.0
2 2016-01-01 Canada_Discount Stickers_Kerneler 301.051077 316.0
3 2016-01-01 Canada_Discount Stickers_Kerneler Dark Mode 361.996929 404.0
# Calculate MAPE in the validation set
# ==============================================================================
# MAPE is undefined when the actual value (denominator) is 0, therefore records
# with 0 in `num_sold` are excluded from the calculation.
mask_not_zero = predictions_validation['num_sold'] != 0
mask_not_nan = predictions_validation['num_sold'].notna()
mask = mask_not_zero & mask_not_nan

mape_validation = mean_absolute_percentage_error(
    y_true = predictions_validation.loc[mask, 'num_sold'],
    y_pred = predictions_validation.loc[mask, 'pred'],
)
print('Overall MAPE in the validation set :', mape_validation)


# MAPE per time series
# ==============================================================================
mape_validation_per_series = (
    predictions_validation
    .query('num_sold != 0 and num_sold.notna()')
    .groupby('unique_id')
    .apply(lambda group: mean_absolute_percentage_error(
        y_true = group['num_sold'],
        y_pred = group['pred'],
    ), include_groups=False)
    .sort_values()
    .reset_index(name='mape')
)
mape_validation_per_series
Overall MAPE in the validation set : 0.0806041750456532
unique_id mape
0 Canada_Discount Stickers_Kaggle 0.043041
1 Norway_Discount Stickers_Kerneler 0.045692
2 Canada_Discount Stickers_Kerneler Dark Mode 0.046368
3 Singapore_Discount Stickers_Kaggle 0.047748
4 Finland_Premium Sticker Mart_Kerneler 0.048505
... ... ...
83 Kenya_Premium Sticker Mart_Kaggle Tiers 0.159288
84 Kenya_Premium Sticker Mart_Holographic Goose 0.166229
85 Norway_Discount Stickers_Holographic Goose 0.193311
86 Singapore_Discount Stickers_Holographic Goose 0.214695
87 Italy_Discount Stickers_Holographic Goose 0.257877

88 rows × 2 columns

The next plot shows the predictions and the actual sales data for four of the series.

set_dark_theme()
series_to_plot = [
    'Italy_Stickers for Less_Kerneler Dark Mode',
    'Singapore_Stickers for Less_Holographic Goose',
    'Italy_Discount Stickers_Holographic Goose',
    'Kenya_Premium Sticker Mart_Holographic Goose'
]

for series in series_to_plot:
    fig, ax = plt.subplots(1, 1, figsize=(7, 3))
    predictions_validation.query('unique_id == @series').plot(
        x='date',
        y=['num_sold', 'pred'],
        ax=ax,
        title=series,
        linewidth=0.7,
    )
    plt.show()

Finally, the model is trained using all available data, and the predict method is used to generate predictions for all series over the next three years (1095 days). All predictions are made at once, immediately following the last date of the training data; the model is not updated with new data between predictions.

# Train the forecaster with all available data
# ==============================================================================
forecaster.fit(series = series_dict, exog = exog_dict)
forecaster
╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
 NaNs detected in `y_train`. They have been dropped because the target variable       
 cannot have NaN values. Same rows have been dropped from `X_train` to maintain       
 alignment. This is caused by series with interspersed NaNs.                          
                                                                                      
 Category : MissingValuesWarning                                                      
 Location :                                                                           
 /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore 
 cast/recursive/_forecaster_recursive_multiseries.py:1191                             
 Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            
╰──────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────── MissingValuesWarning ────────────────────────────────╮
 NaNs detected in `X_train`. Some regressors do not allow NaN values during training. 
 If you want to drop them, set `forecaster.dropna_from_series = True`.                
                                                                                      
 Category : MissingValuesWarning                                                      
 Location :                                                                           
 /home/joaquin/miniconda3/envs/skforecast_16_py12/lib/python3.12/site-packages/skfore 
 cast/recursive/_forecaster_recursive_multiseries.py:1213                             
 Suppress : warnings.simplefilter('ignore', category=MissingValuesWarning)            
╰──────────────────────────────────────────────────────────────────────────────────────╯

ForecasterRecursiveMultiSeries

General Information
  • Regressor: LGBMRegressor
  • Lags: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60]
  • Window features: None
  • Window size: 60
  • Series encoding: ordinal_category
  • Exogenous included: True
  • Weight function included: False
  • Series weights: None
  • Differentiation order: None
  • Creation date: 2025-05-19 14:35:45
  • Last fit date: 2025-05-19 14:38:33
  • Skforecast version: 0.16.0
  • Python version: 3.12.9
  • Forecaster id: None
Exogenous Variables
    is_holiday, is_holiday_lag_1, is_holiday_lag_2, is_holiday_lag_5, is_holiday_lag_7, is_holiday_lag_9, is_holiday_next_1, is_holiday_next_2, is_holiday_next_5, month_sin, month_cos, week_sin, week_cos, day_of_week_sin, day_of_week_cos, country, store, product
Data Transformations
  • Transformer for series: None
  • Transformer for exog: ColumnTransformer(remainder='passthrough', transformers=[('ordinalencoder', OrdinalEncoder(dtype=<class 'int'>, encoded_missing_value=-1, handle_unknown='use_encoded_value', unknown_value=-1), ['country', 'store', 'product'])], verbose_feature_names_out=False)
Training Information
  • Series names (levels): Canada_Discount Stickers_Kaggle, Canada_Discount Stickers_Kaggle Tiers, Canada_Discount Stickers_Kerneler, Canada_Discount Stickers_Kerneler Dark Mode, Canada_Premium Sticker Mart_Holographic Goose, Canada_Premium Sticker Mart_Kaggle, Canada_Premium Sticker Mart_Kaggle Tiers, Canada_Premium Sticker Mart_Kerneler, Canada_Premium Sticker Mart_Kerneler Dark Mode, Canada_Stickers for Less_Holographic Goose, Canada_Stickers for Less_Kaggle, Canada_Stickers for Less_Kaggle Tiers, Canada_Stickers for Less_Kerneler, Canada_Stickers for Less_Kerneler Dark Mode, Finland_Discount Stickers_Holographic Goose, Finland_Discount Stickers_Kaggle, Finland_Discount Stickers_Kaggle Tiers, Finland_Discount Stickers_Kerneler, Finland_Discount Stickers_Kerneler Dark Mode, Finland_Premium Sticker Mart_Holographic Goose, Finland_Premium Sticker Mart_Kaggle, Finland_Premium Sticker Mart_Kaggle Tiers, Finland_Premium Sticker Mart_Kerneler, Finland_Premium Sticker Mart_Kerneler Dark Mode, Finland_Stickers for Less_Holographic Goose, ..., Norway_Premium Sticker Mart_Holographic Goose, Norway_Premium Sticker Mart_Kaggle, Norway_Premium Sticker Mart_Kaggle Tiers, Norway_Premium Sticker Mart_Kerneler, Norway_Premium Sticker Mart_Kerneler Dark Mode, Norway_Stickers for Less_Holographic Goose, Norway_Stickers for Less_Kaggle, Norway_Stickers for Less_Kaggle Tiers, Norway_Stickers for Less_Kerneler, Norway_Stickers for Less_Kerneler Dark Mode, Singapore_Discount Stickers_Holographic Goose, Singapore_Discount Stickers_Kaggle, Singapore_Discount Stickers_Kaggle Tiers, Singapore_Discount Stickers_Kerneler, Singapore_Discount Stickers_Kerneler Dark Mode, Singapore_Premium Sticker Mart_Holographic Goose, Singapore_Premium Sticker Mart_Kaggle, Singapore_Premium Sticker Mart_Kaggle Tiers, Singapore_Premium Sticker Mart_Kerneler, Singapore_Premium Sticker Mart_Kerneler Dark Mode, Singapore_Stickers for Less_Holographic Goose, Singapore_Stickers for Less_Kaggle, Singapore_Stickers for Less_Kaggle Tiers, Singapore_Stickers for Less_Kerneler, Singapore_Stickers for Less_Kerneler Dark Mode
  • Training range: 'Canada_Discount Stickers_Kaggle': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kaggle Tiers': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kerneler': ['2010-01-01', '2016-12-31'], 'Canada_Discount Stickers_Kerneler Dark Mode': ['2010-01-01', '2016-12-31'], 'Canada_Premium Sticker Mart_Holographic Goose': ['2010-01-01', '2016-12-31'], ..., 'Singapore_Stickers for Less_Holographic Goose': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kaggle': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kaggle Tiers': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kerneler': ['2010-01-01', '2016-12-31'], 'Singapore_Stickers for Less_Kerneler Dark Mode': ['2010-01-01', '2016-12-31']
  • Training index type: DatetimeIndex
  • Training index frequency: D
Regressor Parameters
    {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.21343356904845662, 'max_depth': 4, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 800, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 8520, 'reg_alpha': 0.1, 'reg_lambda': 1.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1, 'min_data_in_leaf': 190, 'feature_fraction': 0.7, 'max_bin': 75, 'linear_tree': True, 'device': 'cpu'}
Fit Kwargs
    {'categorical_feature': ['country', 'store', 'product']}


# Feature importance (top 7)
# ==============================================================================
importance = forecaster.get_feature_importances()
importance.head(7)
feature importance
13 lag_14 581
6 lag_7 572
55 lag_56 428
75 week_sin 418
76 week_cos 369
0 lag_1 319
1 lag_2 312
# Prediction of test set
# ==============================================================================
steps = (data_test['date'].max() - data_test['date'].min()).days + 1
print('Number of steps to predict:', steps)
predictions = forecaster.predict(steps=steps, exog=exog_dict_pred, suppress_warnings=True)

# Reverse the log transformation of the predictions
# ==============================================================================
predictions['pred'] = np.expm1(predictions['pred'])
predictions.head(4)
Number of steps to predict: 1095
level pred
2017-01-01 Canada_Discount Stickers_Kaggle 938.299735
2017-01-01 Canada_Discount Stickers_Kaggle Tiers 715.476380
2017-01-01 Canada_Discount Stickers_Kerneler 418.295782
2017-01-01 Canada_Discount Stickers_Kerneler Dark Mode 501.718288

Two of the series were excluded from the training set because they contained only missing values. skforecast allows users to forecast series that were not seen during model training; however, their predictions are not returned by default. To obtain predictions for these series, the last_window argument of the predict method must be used.

# Predict unseen series during training
# ==============================================================================
last_window_unseen_series = pd.DataFrame(
    data    = np.nan,
    index   = pd.date_range(end='2016-12-31', periods=forecaster.window_size, freq='D'),
    columns = ['Canada_Discount Stickers_Holographic Goose', 'Kenya_Discount Stickers_Holographic Goose']
)
predictions_unseen_series = forecaster.predict(
    steps             = steps,
    last_window       = last_window_unseen_series,
    exog              = exog_dict_pred,
    suppress_warnings = True
)
predictions_unseen_series['pred'] = np.expm1(predictions_unseen_series['pred'])
predictions_unseen_series
level pred
2017-01-01 Canada_Discount Stickers_Holographic Goose 130.095322
2017-01-01 Kenya_Discount Stickers_Holographic Goose 13.561558
2017-01-02 Canada_Discount Stickers_Holographic Goose 131.262703
2017-01-02 Kenya_Discount Stickers_Holographic Goose 15.972330
2017-01-03 Canada_Discount Stickers_Holographic Goose 163.919315
... ... ...
2019-12-29 Kenya_Discount Stickers_Holographic Goose 102.435982
2019-12-30 Canada_Discount Stickers_Holographic Goose 344.756188
2019-12-30 Kenya_Discount Stickers_Holographic Goose 90.246888
2019-12-31 Canada_Discount Stickers_Holographic Goose 335.771299
2019-12-31 Kenya_Discount Stickers_Holographic Goose 91.815570

2190 rows × 2 columns

Submission results

predictions_all = pd.concat([predictions, predictions_unseen_series])
submission = data_test.merge(
    predictions_all.reset_index(names=['date']),
    how      = 'left',
    left_on  = ['date', 'unique_id'],
    right_on = ['date', 'level'],
    validate = 'one_to_one'
)

submission = submission.loc[:, ['id', 'pred']]
submission = submission.rename(columns={'pred': 'num_sold'})
submission.to_csv('submission.csv', index=False)
submission
id num_sold
0 230130 130.095322
1 230131 938.299735
2 230132 715.476380
3 230133 418.295782
4 230134 501.718288
... ... ...
98545 328675 415.009558
98546 328676 3036.446657
98547 328677 2080.632355
98548 328678 1366.599418
98549 328679 1640.340384

98550 rows × 2 columns
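
Before uploading, a few quick sanity checks help catch alignment problems (an optional safeguard, not part of the original workflow):

# Sanity checks on the submission file
# ==============================================================================
assert len(submission) == len(data_test), "expected one row per test id"
assert submission['num_sold'].notna().all(), "missing predictions detected"
assert (submission['num_sold'] >= 0).all(), "negative sales predicted"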

# Upload submission to Kaggle
# ==============================================================================
# !pip install kaggle
# !kaggle competitions submit -c playground-series-s5e1 -f submission.csv -m "uploading submission"

Session information

import session_info
session_info.show(html=False)
-----
feature_engine      1.8.3
holidays            0.72
lightgbm            4.6.0
matplotlib          3.10.1
numpy               2.2.5
optuna              3.6.2
pandas              2.2.3
session_info        v1.0.1
skforecast          0.16.0
sklearn             1.6.1
-----
IPython             9.1.0
jupyter_client      8.6.3
jupyter_core        5.7.2
notebook            6.5.7
-----
Python 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27) [GCC 11.2.0]
Linux-6.11.0-25-generic-x86_64-with-glibc2.39
-----
Session information updated at 2025-05-19 14:38

Citation

How to cite this document

If you use this document or any part of it, please acknowledge the source, thank you!

A Step-by-Step Guide to Global Time Series Forecasting Using Kaggle Sticker Sales Data by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) at https://cienciadedatos.net/documentos/py66-forecasting-sticker-sales-kaggle.html


How to cite skforecast

If you use skforecast for a publication, we would appreciate it if you cited the published software.

Zenodo:

Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.16.0). Zenodo. https://doi.org/10.5281/zenodo.8382788

APA:

Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.16.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788

BibTeX:

@software{skforecast,
  author  = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier},
  title   = {skforecast},
  version = {0.16.0},
  month   = {05},
  year    = {2025},
  license = {BSD-3-Clause},
  url     = {https://skforecast.org/},
  doi     = {10.5281/zenodo.8382788}
}


Did you like the article? Your support is important

Your contribution will help me to continue generating free educational content. Many thanks! 😊

Become a GitHub Sponsor


This work by Joaquín Amat Rodrigo and Javier Escobar Ortiz is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.

Allowed:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NonCommercial: You may not use the material for commercial purposes.

  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.