Scalable forecasting: modeling a thousand time series with a single global model


Joaquín Amat Rodrigo, Javier Escobar Ortiz
November, 2024

Introduction

In scenarios involving the prediction of hundreds or thousands of time series, a crucial decision arises: should an individual model be developed for each series, or should a single model handle all of them at once?

In single-series modeling (local forecasting model), an independent predictive model is created for each time series. Although this method provides a thorough understanding of each series, its scalability can be hampered by the need to create and maintain hundreds or thousands of models.

Multi-series modeling (global forecasting model) consists of building a single predictive model that considers all time series simultaneously. It attempts to capture the core patterns that govern the series, thereby mitigating the noise that each individual series may introduce. This approach is computationally efficient, easy to maintain, and can yield more robust generalizations, although potentially at the cost of sacrificing some series-specific insights.
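
To make the difference concrete, the following sketch builds the training matrices for both approaches on toy data. It is only an illustration under stated assumptions: the series names, lengths, and the helper `make_lag_matrix` are hypothetical, and the integer series identifier is just one simple way of telling the model which series a row belongs to.

```python
import numpy as np

def make_lag_matrix(y, n_lags):
    """Return (X, target): each row of X holds the n_lags values preceding its target."""
    y = np.asarray(y, dtype=float)
    # Column k-1 contains lag k, i.e. y[t - k] for every target y[t]
    X = np.column_stack([y[n_lags - k: len(y) - k] for k in range(1, n_lags + 1)])
    return X, y[n_lags:]

rng = np.random.default_rng(0)
# Three toy series of different lengths (hypothetical data)
series = {
    "building_a": rng.normal(100, 5, size=50),
    "building_b": rng.normal(300, 10, size=40),
    "building_c": rng.normal(50, 2, size=60),
}
n_lags = 7

# Local approach: one training matrix (and therefore one model) per series
local_sets = {name: make_lag_matrix(y, n_lags) for name, y in series.items()}

# Global approach: stack every matrix and append an integer series identifier,
# so that a single model can be trained on all rows at once
X_parts, y_parts = [], []
for series_id, (name, (X, y)) in enumerate(local_sets.items()):
    X_parts.append(np.column_stack([X, np.full(len(X), series_id)]))
    y_parts.append(y)
X_global = np.vstack(X_parts)
y_global = np.concatenate(y_parts)

print("Local matrices :", {name: X.shape for name, (X, _) in local_sets.items()})
print("Global matrix  :", X_global.shape)
```

With the local approach, three regressors would be fitted on three small matrices. With the global approach, one regressor sees every row, which is what allows it to learn patterns shared across series.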

This document shows how to forecast more than 1,000 time series with a single model that includes exogenous features, some of which take different values for each series.

Libraries

In [1]:
# Data management
# ==============================================================================
import numpy as np
import pandas as pd

# Plots
# ==============================================================================
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
plt.style.use('seaborn-v0_8-darkgrid')

# Forecasting
# ==============================================================================
import skforecast
import lightgbm
from lightgbm import LGBMRegressor
from sklearn.preprocessing import OrdinalEncoder
from sklearn.compose import make_column_transformer
from sklearn.feature_selection import RFECV
from skforecast.recursive import ForecasterRecursiveMultiSeries
from skforecast.model_selection import TimeSeriesFold, OneStepAheadFold
from skforecast.model_selection import backtesting_forecaster_multiseries
from skforecast.model_selection import bayesian_search_forecaster_multiseries
from skforecast.feature_selection import select_features_multiseries
from skforecast.preprocessing import RollingFeatures
from skforecast.preprocessing import series_long_to_dict
from skforecast.preprocessing import exog_long_to_dict
from feature_engine.datetime import DatetimeFeatures
from feature_engine.creation import CyclicalFeatures
from skforecast.datasets import fetch_dataset

# Configuration
# ==============================================================================
import warnings
warnings.filterwarnings('once')

print('skforecast version:', skforecast.__version__)
print('lightgbm version:', lightgbm.__version__)
skforecast version: 0.14.0
lightgbm version: 4.5.0

Data

The data used in this document were obtained from The Building Data Genome Project 2 (https://github.com/buds-lab/building-data-genome-project-2). The dataset contains information on the energy consumption of more than 1,500 buildings. The time series span two full years (2016 and 2017) and consist of hourly measurements from electricity, heating and cooling water, steam, and irrigation meters. The dataset also includes information about building characteristics and weather conditions. For this document, the data have been aggregated to a daily resolution, and only electricity has been retained among the energy sources.

In [2]:
# Data download
# ==============================================================================
data = fetch_dataset(name='bdg2_daily')
print("Data shape:", data.shape)
data.head(3)
bdg2_daily
----------
Daily energy consumption data from the The Building Data Genome Project 2 with
building metadata and weather data. https://github.com/buds-lab/building-data-
genome-project-2
Miller, C., Kathirgamanathan, A., Picchetti, B. et al. The Building Data Genome
Project 2, energy meter data from the ASHRAE Great Energy Predictor III
competition. Sci Data 7, 368 (2020). https://doi.org/10.1038/s41597-020-00712-x
Shape of the dataset: (1153518, 17)
Data shape: (1153518, 17)
Out[2]:
building_id meter_reading site_id primaryspaceusage sub_primaryspaceusage sqm lat lng timezone airTemperature cloudCoverage dewTemperature precipDepth1HR precipDepth6HR seaLvlPressure windDirection windSpeed
timestamp
2016-01-01 Bear_assembly_Angel 12808.1620 Bear Entertainment/public assembly Entertainment/public assembly 22117.0 37.871903 -122.260729 US/Pacific 6.1750 1.666667 -5.229167 0.0 0.0 1020.891667 68.750000 3.070833
2016-01-02 Bear_assembly_Angel 9251.0003 Bear Entertainment/public assembly Entertainment/public assembly 22117.0 37.871903 -122.260729 US/Pacific 8.0875 NaN -1.404167 0.0 0.0 1017.687500 76.666667 3.300000
2016-01-03 Bear_assembly_Angel 14071.6500 Bear Entertainment/public assembly Entertainment/public assembly 22117.0 37.871903 -122.260729 US/Pacific 10.1125 NaN 1.708333 -6.0 -2.0 1011.491667 91.666667 3.120833
In [3]:
# Range of available dates
# ==============================================================================
print(
    f"Range of available dates : {data.index.min()} --- {data.index.max()}  "
    f"(n_days={(data.index.max() - data.index.min()).days})"
)
Range of available dates : 2016-01-01 00:00:00 --- 2017-12-31 00:00:00  (n_days=730)
In [4]:
# Range of available dates per series
# ==============================================================================
available_dates_per_series = (
    data
    .dropna(subset="meter_reading")
    .reset_index()
    .groupby("building_id")
    .agg(
        min_index=("timestamp", "min"),
        max_index=("timestamp", "max"),
        n_values=("timestamp", "nunique")
    )
)
display(available_dates_per_series)
print(f"Series lengths : {available_dates_per_series.n_values.unique()}")
                         min_index   max_index  n_values
building_id
Bear_assembly_Angel     2016-01-01  2017-12-31       731
Bear_assembly_Beatrice  2016-01-01  2017-12-31       731
Bear_assembly_Danial    2016-01-01  2017-12-31       731
Bear_assembly_Diana     2016-01-01  2017-12-31       731
Bear_assembly_Genia     2016-01-01  2017-12-31       731
...                            ...         ...       ...
Wolf_public_Norma       2016-01-01  2017-12-31       731
Wolf_retail_Harriett    2016-01-01  2017-12-31       731
Wolf_retail_Marcella    2016-01-01  2017-12-31       731
Wolf_retail_Toshia      2016-01-01  2017-12-31       731
Wolf_science_Alfreda    2016-01-01  2017-12-31       731

1578 rows × 3 columns

Series lengths : [731]

All time series have the same length, starting on January 1, 2016 and ending on December 31, 2017. The exogenous variables have few missing values. Skforecast does not require the time series to have the same length, and missing values are allowed as long as the underlying regressor can handle them, which is the case for LightGBM, XGBoost, and HistGradientBoostingRegressor.

In [5]:
# Percentage of missing values per variable
# ==============================================================================
data.isna().mean().mul(100).round(2)
Out[5]:
building_id               0.00
meter_reading             0.00
site_id                   0.00
primaryspaceusage         1.20
sub_primaryspaceusage     1.20
sqm                       0.00
lat                      14.83
lng                      14.83
timezone                  0.00
airTemperature            0.02
cloudCoverage             7.02
dewTemperature            0.03
precipDepth1HR            0.02
precipDepth6HR            0.02
seaLvlPressure            9.56
windDirection             0.02
windSpeed                 0.02
dtype: float64

Exogenous variables

Exogenous variables are variables external to the time series that can be used as predictors to improve the forecast. In this case, the exogenous variables are the building characteristics, calendar features, and weather conditions.

Warning

Exogenous variables must be known at prediction time. For example, if temperature is used as an exogenous variable, the temperature for the next day must be known when the forecast is made. If that value is not available, the prediction cannot be made.
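
A quick way to check this requirement before forecasting is to compare the dates of the forecast horizon with the dates available in the exogenous data. The sketch below uses plain pandas. The helper `missing_exog_dates` and the toy temperature values are hypothetical and are not part of skforecast.

```python
import pandas as pd

def missing_exog_dates(last_observed, exog, steps, freq="D"):
    """Return the dates of the next `steps` periods that are absent from `exog`."""
    # Forecast horizon: the `steps` periods that follow the last observed date
    horizon = pd.date_range(start=last_observed, periods=steps + 1, freq=freq)[1:]
    return horizon.difference(exog.index)

# Toy exogenous data: a temperature forecast available for three future days
exog_future = pd.DataFrame(
    {"airTemperature": [6.2, 8.1, 10.1]},
    index=pd.date_range("2018-01-01", periods=3, freq="D"),
)
last_observed = pd.Timestamp("2017-12-31")

print(missing_exog_dates(last_observed, exog_future, steps=3))  # nothing missing
print(missing_exog_dates(last_observed, exog_future, steps=5))  # two dates missing
```

If the returned index is not empty, the exogenous values for those dates have to be obtained (for example, from a weather forecast) before the model can predict that far ahead.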

Building characteristics

One of the key attributes associated with each building is its designated use. This attribute can play a crucial role in the energy consumption pattern.

In [6]:
# Number of buildings, types, and subtypes
# ==============================================================================
print(f"Number of buildings: {data['building_id'].nunique()}")
print(f"Number of building types: {data['primaryspaceusage'].nunique()}")
print(f"Number of building subtypes: {data['sub_primaryspaceusage'].nunique()}")
Number of buildings: 1578
Number of building types: 16
Number of building subtypes: 104

Some building types and subtypes appear infrequently in the dataset. Types with fewer than 100 buildings and subtypes with fewer than 50 buildings are grouped into the category "Other".

In [7]:
# Grouping of infrequent categories
# ==============================================================================
infrequent_types = (
    data
    .drop_duplicates(subset=['building_id'])['primaryspaceusage']
    .value_counts()
    .loc[lambda x: x < 100]
    .index
    .tolist()
)
infrequent_subtypes = (
    data
    .drop_duplicates(subset=['building_id'])['sub_primaryspaceusage']
    .value_counts()
    .loc[lambda x: x < 50]
    .index
    .tolist()
)

data['primaryspaceusage'] = np.where(
    data['primaryspaceusage'].isin(infrequent_types),
    'Other',
    data['primaryspaceusage']
)
data['sub_primaryspaceusage'] = np.where(
    data['sub_primaryspaceusage'].isin(infrequent_subtypes),
    'Other',
    data['sub_primaryspaceusage']
)

display(data.drop_duplicates(subset=['building_id'])['primaryspaceusage'].value_counts())
display(data.drop_duplicates(subset=['building_id', 'sub_primaryspaceusage'])['sub_primaryspaceusage'].value_counts())
primaryspaceusage
Education                        604
Office                           296
Entertainment/public assembly    203
Public services                  166
Lodging/residential              149
Other                            141
Name: count, dtype: int64
sub_primaryspaceusage
Other                          612
Office                         295
College Classroom              131
College Laboratory             116
K-12 School                    109
Dormitory                       91
Primary/Secondary Classroom     84
Education                       67
Library                         54
Name: count, dtype: int64

Calendar features

In [8]:
# Calendar features
# ==============================================================================
features_to_extract = [
    'month',
    'week',
    'day_of_week',
]
calendar_transformer = DatetimeFeatures(
                            variables           = 'index',
                            features_to_extract = features_to_extract,
                            drop_original       = False,
                       )
data = calendar_transformer.fit_transform(data)
In [9]:
# Cyclical encoding
# ==============================================================================
features_to_encode = [
    "month",
    "week",
    "day_of_week",
]
max_values = {
    "month": 12,
    "week": 52,
    "day_of_week": 6,
}
cyclical_encoder = CyclicalFeatures(
                        variables     = features_to_encode,
                        max_values    = max_values,
                        drop_original = False
                   )

data = cyclical_encoder.fit_transform(data)
data.head(3)
Out[9]:
building_id meter_reading site_id primaryspaceusage sub_primaryspaceusage sqm lat lng timezone airTemperature ... windSpeed month week day_of_week month_sin month_cos week_sin week_cos day_of_week_sin day_of_week_cos
timestamp
2016-01-01 Bear_assembly_Angel 12808.1620 Bear Entertainment/public assembly Other 22117.0 37.871903 -122.260729 US/Pacific 6.1750 ... 3.070833 1 53 4 0.5 0.866025 0.120537 0.992709 -8.660254e-01 -0.5
2016-01-02 Bear_assembly_Angel 9251.0003 Bear Entertainment/public assembly Other 22117.0 37.871903 -122.260729 US/Pacific 8.0875 ... 3.300000 1 53 5 0.5 0.866025 0.120537 0.992709 -8.660254e-01 0.5
2016-01-03 Bear_assembly_Angel 14071.6500 Bear Entertainment/public assembly Other 22117.0 37.871903 -122.260729 US/Pacific 10.1125 ... 3.120833 1 53 6 0.5 0.866025 0.120537 0.992709 -2.449294e-16 1.0

3 rows × 26 columns

✎ Note

For more information about calendar features and cyclical encoding visit Calendar features and Cyclical features in time series forecasting.
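
As a minimal sketch of what cyclical encoding does, the snippet below applies the usual sine/cosine transformation, `sin(2π·x / max_value)` and `cos(2π·x / max_value)`, using only the standard library. The helper `cyclical_encode` is hypothetical. For `month = 1` and `max_value = 12` it gives the `month_sin` and `month_cos` values shown in the table above.

```python
import math

def cyclical_encode(value, max_value):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2 * math.pi * value / max_value
    return math.sin(angle), math.cos(angle)

def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

january = cyclical_encode(1, 12)
july = cyclical_encode(7, 12)
december = cyclical_encode(12, 12)

print(f"January  : sin={january[0]:.6f}, cos={january[1]:.6f}")
print(f"Distance December-January: {distance(december, january):.3f}")
print(f"Distance July-January    : {distance(july, january):.3f}")
```

With the raw integer encoding, December (12) and January (1) are eleven units apart. After the transformation they are neighbors on the circle, whereas January and July end up on opposite sides.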

Weather variables

The weather variables are recorded at the site level, which means that the weather data vary with the location of the building, even at the same timestamp. In other words, although the same exogenous variables are available for all series, their values differ by location.

In [10]:
# Weather values for each site on a given date
# ==============================================================================
data.loc["2016-01-01"].groupby("site_id", observed=True).agg(
    {
        "airTemperature": "first",
        "cloudCoverage": "first",
        "dewTemperature": "first",
        "precipDepth1HR": "first",
        "precipDepth6HR": "first",
        "seaLvlPressure": "first",
        "windDirection": "first",
        "windSpeed": "first",
    }
)
Out[10]:
airTemperature cloudCoverage dewTemperature precipDepth1HR precipDepth6HR seaLvlPressure windDirection windSpeed
site_id
Bear 6.175000 1.666667 -5.229167 0.0 0.0 1020.891667 68.750000 3.070833
Bobcat -11.595833 0.000000 -17.041667 0.0 0.0 1034.466667 260.869565 3.033333
Bull 7.612500 4.000000 0.708333 -2.0 -1.0 1031.762500 130.416667 3.712500
Cockatoo -2.000000 4.000000 -4.066667 0.0 0.0 NaN 281.333333 4.826667
Crow -1.787500 NaN -3.595833 15.0 13.0 1011.670833 236.250000 3.666667
Eagle 3.187500 1.000000 -4.487500 0.0 0.0 1017.445833 287.500000 3.470833
Fox 10.200000 1.000000 -2.804167 0.0 0.0 1018.512500 47.083333 0.470833
Gator 23.291667 5.600000 19.625000 -3.0 0.0 1018.663636 150.416667 2.520833
Hog -5.583333 1.142857 -9.741667 -2.0 -1.0 1019.545833 260.416667 4.758333
Lamb 6.913043 0.000000 5.434783 0.0 0.0 NaN 123.181818 8.017391
Moose -1.787500 NaN -3.595833 15.0 13.0 1011.670833 236.250000 3.666667
Mouse 5.387500 0.000000 3.879167 0.0 0.0 1016.941667 116.666667 4.470833
Panther 23.291667 5.600000 19.625000 -3.0 0.0 1018.663636 150.416667 2.520833
Peacock 5.558333 0.000000 -1.541667 0.0 0.0 1019.783333 120.000000 1.275000
Rat 5.633333 4.727273 -2.112500 0.0 0.0 1020.450000 314.782609 3.816667
Robin 5.387500 0.000000 3.879167 0.0 0.0 1016.941667 116.666667 4.470833
Shrew 5.387500 0.000000 3.879167 0.0 0.0 1016.941667 116.666667 4.470833
Swan 6.054167 0.400000 -2.591667 0.0 0.0 1021.216667 154.000000 1.783333
Wolf 5.716667 6.625000 3.208333 0.0 1.0 1007.625000 140.000000 8.875000

Skforecast allows different exogenous variables and/or different values to be included for each series in the dataset (details are provided in the next section).

Categorical variables

LightGBM can include categorical variables in the model without the need for preprocessing such as one-hot encoding. To allow automatic detection of categorical features in a Forecaster, the categorical variables must first be encoded as integers (ordinal encoding) and then stored as type category. This is necessary because skforecast uses a numeric array internally to speed up computation, and LightGBM requires categorical features to be stored as category in order to detect them automatically. It is also necessary to set the categorical_feature parameter to 'auto' when initializing the forecaster, using fit_kwargs = {'categorical_feature': 'auto'}. Alternatively, as in the forecaster created later in this document, the names of the categorical columns can be passed explicitly through the same argument.

Warning

The four main gradient boosting implementations (LightGBM, scikit-learn's HistGradientBoosting, XGBoost, and CatBoost) can handle categorical variables natively within the model. However, each implementation has its own configuration, benefits, and potential pitfalls. For details on how to use each of them, see the skforecast user guide.
In [11]:
# Transformer: ordinal encoding
# ==============================================================================
# A ColumnTransformer is used to transform the categorical (non-numeric)
# variables with ordinal encoding. Numeric variables are left untouched.
# Missing values are encoded as NaN. If a new category is found in the test
# set, it is also encoded as NaN.
categorical_features = ['primaryspaceusage', 'sub_primaryspaceusage', 'timezone']
transformer_exog = make_column_transformer(
                       (
                           OrdinalEncoder(
                               dtype=float,
                               handle_unknown="use_encoded_value",
                               unknown_value=np.nan,
                               encoded_missing_value=np.nan
                           ),
                           categorical_features
                       ),
                       remainder="passthrough",
                       verbose_feature_names_out=False,
                   ).set_output(transform="pandas")
transformer_exog
Out[11]:
ColumnTransformer(remainder='passthrough',
                  transformers=[('ordinalencoder',
                                 OrdinalEncoder(dtype=<class 'float'>,
                                                handle_unknown='use_encoded_value',
                                                unknown_value=nan),
                                 ['primaryspaceusage', 'sub_primaryspaceusage',
                                  'timezone'])],
                  verbose_feature_names_out=False)

When creating a Forecaster with LGBMRegressor, it is necessary to specify how to handle the categorical columns using the fit_kwargs argument. This is because the categorical_feature argument is only specified in the fit method of LGBMRegressor, and not during its initialization.
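
The two-step idea described in this section (ordinal encoding followed by storage as category) can be illustrated with pandas alone. This is only a sketch on a hypothetical toy DataFrame and a stand-in for the OrdinalEncoder step, not the transformation skforecast performs internally.

```python
import pandas as pd

# Toy building attributes (hypothetical values), including one missing category
df = pd.DataFrame({
    "primaryspaceusage": ["Office", "Education", None, "Office"],
    "sqm": [1200.0, 5300.0, 800.0, 2100.0],
})

# Step 1: ordinal encoding. Categories are numbered in sorted order and
# pandas marks missing values with the code -1
codes = df["primaryspaceusage"].astype("category").cat.codes

# Step 2: turn the -1 marker into NaN so that missing values stay missing
encoded = codes.astype(float).where(codes >= 0)

# Step 3: store the numeric codes with dtype category so that a model with
# automatic detection can recognise the column as categorical
df["primaryspaceusage"] = encoded.astype("category")

print(df)
print(df.dtypes)
```

The numeric column `sqm` is left untouched, mirroring the `remainder="passthrough"` behaviour of the ColumnTransformer defined above.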

Modeling and forecasting

ForecasterRecursiveMultiSeries can model time series of different lengths and with different exogenous variables. When the series have different lengths, the data must be transformed into a dictionary whose keys are the series names and whose values are the series themselves. This is done with the function series_long_to_dict, which takes a DataFrame in "long format" and returns a dictionary of pandas Series. Similarly, when the exogenous variables differ between series (in values or in the variables themselves), the data must be transformed into a dictionary whose keys are the series names and whose values are the exogenous variables. This is done with the function exog_long_to_dict, which takes a DataFrame in "long format" and returns a dictionary of exogenous variables (pandas Series or pandas DataFrames).

When all series have the same length and share the same exogenous variables, dictionaries are not needed. The series can be passed as a single DataFrame with one series per column, and the exogenous variables as a DataFrame with the same length as the series.

✎ Note

For more information on how to model series of different lengths and use different exogenous variables, visit Global Forecasting Models: Time series with different lengths and different exogenous variables.
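
Conceptually, the long-to-dictionary transformation amounts to grouping the long DataFrame by series identifier and indexing each group by time. The sketch below reproduces that idea with plain pandas on a hypothetical toy DataFrame. It is not the implementation of series_long_to_dict or exog_long_to_dict, which are the functions used in the cells that follow.

```python
import pandas as pd

# Toy long-format data (hypothetical): series "b" starts one day later than "a"
long_df = pd.DataFrame({
    "building_id": ["a", "a", "a", "b", "b"],
    "timestamp": pd.to_datetime(
        ["2016-01-01", "2016-01-02", "2016-01-03", "2016-01-02", "2016-01-03"]
    ),
    "meter_reading": [10.0, 12.0, 11.0, 5.0, 6.0],
    "airTemperature": [6.1, 8.0, 10.1, 7.0, 9.5],
})

# One pandas Series per building, indexed by date with daily frequency
toy_series_dict = {
    name: group.set_index("timestamp")["meter_reading"].asfreq("D")
    for name, group in long_df.groupby("building_id")
}

# One DataFrame of exogenous variables per building
toy_exog_dict = {
    name: group.set_index("timestamp")[["airTemperature"]].asfreq("D")
    for name, group in long_df.groupby("building_id")
}

for name, s in toy_series_dict.items():
    print(name, len(s), s.index.min().date(), s.index.max().date())
```

Each series keeps its own date range, which is why a dictionary is a natural container when lengths differ.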
In [12]:
# Exogenous variables included in the model
# ==============================================================================
exog_features = [
    "primaryspaceusage",
    "sub_primaryspaceusage",
    "timezone",
    "sqm",
    "airTemperature",
    "cloudCoverage",
    "dewTemperature",
    "precipDepth1HR",
    "precipDepth6HR",
    "seaLvlPressure",
    "windDirection",
    "windSpeed",
    "day_of_week_sin",
    "day_of_week_cos",
    "week_sin",
    "week_cos",
    "month_sin",
    "month_cos",
]
In [13]:
# Transform the series and exogenous variables to dict format
# ==============================================================================
series_dict = series_long_to_dict(
    data      = data.reset_index(),
    series_id = 'building_id',
    index     = 'timestamp',
    values    = 'meter_reading',
    freq      = 'D'
)

exog_dict = exog_long_to_dict(
    data      = data[exog_features + ['building_id']].reset_index(),
    series_id = 'building_id',
    index     = 'timestamp',
    freq      = 'D'
)

To train the models, search for optimal hyperparameters, and evaluate their predictive performance, the data are split into three separate sets: training, validation, and test.

In [14]:
# Split the data into training, validation, and test sets
# ==============================================================================
data = data.sort_index()
end_train = '2017-08-31 23:59:00'
end_validation = '2017-10-31 23:59:00'
series_dict_train = {k: v.loc[: end_train,] for k, v in series_dict.items()}
series_dict_valid = {k: v.loc[end_train: end_validation,] for k, v in series_dict.items()}
series_dict_test = {k: v.loc[end_validation:,] for k, v in series_dict.items()}
exog_dict_train = {k: v.loc[: end_train,] for k, v in exog_dict.items()}
exog_dict_valid = {k: v.loc[end_train: end_validation,] for k, v in exog_dict.items()}
exog_dict_test = {k: v.loc[end_validation:,] for k, v in exog_dict.items()}

print(
    f"Range of dates available : {data.index.min()} --- {data.index.max()} "
    f"(n_days={(data.index.max() - data.index.min()).days})"
)
print(
    f"  Dates for training    : {data.loc[: end_train, :].index.min()} --- {data.loc[: end_train, :].index.max()} "
    f"(n_days={(data.loc[: end_train, :].index.max() - data.loc[: end_train, :].index.min()).days})"
)
print(
    f"  Dates for validation  : {data.loc[end_train:end_validation, :].index.min()} --- {data.loc[end_train:end_validation, :].index.max()} "
    f"(n_days={(data.loc[end_train:end_validation, :].index.max() - data.loc[end_train:end_validation, :].index.min()).days})"
)
print(
    f"  Dates for test        : {data.loc[end_validation:, :].index.min()} --- {data.loc[end_validation:, :].index.max()} "
    f"(n_days={(data.loc[end_validation:, :].index.max() - data.loc[end_validation:, :].index.min()).days})"
)
Range of dates available : 2016-01-01 00:00:00 --- 2017-12-31 00:00:00 (n_days=730)
  Dates for training    : 2016-01-01 00:00:00 --- 2017-08-31 00:00:00 (n_days=608)
  Dates for validation  : 2017-09-01 00:00:00 --- 2017-10-31 00:00:00 (n_days=60)
  Dates for test        : 2017-11-01 00:00:00 --- 2017-12-31 00:00:00 (n_days=60)
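
The cut-off timestamps are placed at 23:59 so that label-based slicing, which includes both endpoints, never assigns the same day to two partitions. The sketch below checks this on a toy daily series covering the same date range. The series values are hypothetical, but the cut-offs are the ones used above.

```python
import pandas as pd

# Toy daily series spanning the same two years as the dataset
index = pd.date_range("2016-01-01", "2017-12-31", freq="D")
y = pd.Series(range(len(index)), index=index, dtype=float)

end_train = "2017-08-31 23:59:00"
end_validation = "2017-10-31 23:59:00"

# .loc slicing is inclusive on both sides, but no daily timestamp falls
# exactly on a 23:59 cut-off, so the partitions cannot overlap
y_train = y.loc[:end_train]
y_valid = y.loc[end_train:end_validation]
y_test = y.loc[end_validation:]

print(len(y_train), len(y_valid), len(y_test))
print(y_train.index.max(), "<", y_valid.index.min())
```

Note that the `n_days` values printed above are differences between the first and last date of each partition, so each partition contains one more observation than its `n_days` (for example, 609 daily observations in the training set).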

Hyperparameter search

Hyperparameter and lag tuning involves systematically testing different values of the hyperparameters (and/or lags) to find the configuration that gives the best performance. skforecast provides two different methods to evaluate each candidate configuration:

  • Backtesting: the model predicts several steps ahead in each iteration, using the same forecast horizon and retraining strategy that would be used if the model were deployed. This simulates a real forecasting scenario in which the model is retrained and updated over time.

  • One-Step Ahead: the model is evaluated using only one-step-ahead predictions. This method is faster because it requires fewer iterations, but it only evaluates the performance of the model at the next time step (t+1).

Each method uses a different evaluation strategy, so they may produce different results. In the long run, however, both are expected to converge on similar selections of optimal hyperparameters. Because one-step-ahead evaluation only measures performance at t+1, it is recommended to backtest the final model to obtain a more accurate estimate of its performance when forecasting multiple steps ahead.
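
The difference between the two evaluation strategies can be sketched with a deliberately simple model. The example below assumes a toy AR(1)-like series and a single-coefficient model fitted by least squares, both hypothetical. It does not use skforecast and only illustrates how the inputs to each prediction differ.

```python
import numpy as np

rng = np.random.default_rng(123)

# Toy AR(1)-like series (hypothetical data)
n, n_train = 200, 150
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal(scale=1.0)
train, test = y[:n_train], y[n_train:]

# "Model": a single AR coefficient estimated by least squares on the training set
phi = np.dot(train[:-1], train[1:]) / np.dot(train[:-1], train[:-1])

# One-step-ahead evaluation: every prediction uses the true previous value
prev_values = np.concatenate([train[-1:], test[:-1]])
pred_one_step = phi * prev_values
mae_one_step = np.mean(np.abs(test - pred_one_step))

# Backtesting-style evaluation (no refit): forecast `steps` values recursively,
# feeding each prediction back as the next input, then move the origin forward
steps = 10
pred_multi = np.empty_like(test)
for start in range(0, len(test), steps):
    last = y[n_train + start - 1]  # last observed value before this fold
    for h in range(start, min(start + steps, len(test))):
        last = phi * last          # the prediction becomes the next lag
        pred_multi[h] = last
mae_multi = np.mean(np.abs(test - pred_multi))

print(f"MAE one-step-ahead         : {mae_one_step:.3f}")
print(f"MAE multi-step (10 steps)  : {mae_multi:.3f}")
```

Only the first prediction of each fold coincides between the two strategies. From the second step onward, the multi-step forecast relies on its own previous predictions, which is why its error reflects how the model would behave when forecasting several steps ahead.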

In [15]:
# Create forecaster
# ==============================================================================
window_features = RollingFeatures(stats=['mean', 'min', 'max'], window_sizes=7)
forecaster = ForecasterRecursiveMultiSeries(
                regressor          = LGBMRegressor(random_state=8520, verbose=-1),
                lags               = 14,
                window_features    = window_features,
                transformer_series = None,
                transformer_exog   = transformer_exog,
                fit_kwargs         = {'categorical_feature': categorical_features},
                encoding           = "ordinal"
            )
In [ ]:
# Bayesian search with OneStepAheadFold
# ==============================================================================
def search_space(trial):
    search_space  = {
        'lags'            : trial.suggest_categorical('lags', [31, 62]),
        'n_estimators'    : trial.suggest_int('n_estimators', 200, 800, step=100),
        'max_depth'       : trial.suggest_int('max_depth', 3, 8, step=1),
        'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 25, 500),
        'learning_rate'   : trial.suggest_float('learning_rate', 0.01, 0.5),
        'feature_fraction': trial.suggest_float('feature_fraction', 0.5, 0.8, step=0.1),
        'max_bin'         : trial.suggest_int('max_bin', 50, 100, step=25),
        'reg_alpha'       : trial.suggest_float('reg_alpha', 0, 1, step=0.1),
        'reg_lambda'      : trial.suggest_float('reg_lambda', 0, 1, step=0.1)
    }

    return search_space

cv = OneStepAheadFold(initial_train_size=608) # Size of the training set

results_search, best_trial = bayesian_search_forecaster_multiseries(
    forecaster        = forecaster,
    series            = {k: v.loc[:end_validation,] for k, v in series_dict.items()},
    exog              = {k: v.loc[:end_validation, exog_features] for k, v in exog_dict.items()},
    cv                = cv,
    search_space      = search_space,
    n_trials          = 10,
    metric            = "mean_absolute_error",
    return_best       = True,
    verbose           = False,
    n_jobs            = "auto",
    show_progress     = True,
    suppress_warnings = True,
)

best_params = results_search.at[0, 'params']
best_lags = results_search.at[0, 'lags']
results_search.head(3)
/home/ubuntu/anaconda3/envs/skforecast_14_py12/lib/python3.12/site-packages/skforecast/recursive/_forecaster_recursive_multiseries.py:1077: MissingValuesWarning: NaNs detected in `X_train`. Some regressors do not allow NaN values during training. If you want to drop them, set `forecaster.dropna_from_series = True`. 
 You can suppress this warning using: warnings.simplefilter('ignore', category=MissingValuesWarning)
  warnings.warn(
`Forecaster` refitted using the best-found lags and parameters, and the whole data set: 
  Lags: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55 56 57 58 59 60 61 62] 
  Parameters: {'n_estimators': 600, 'max_depth': 6, 'min_data_in_leaf': 188, 'learning_rate': 0.1590191866233202, 'feature_fraction': 0.6, 'max_bin': 100, 'reg_alpha': 0.9, 'reg_lambda': 0.5}
  Backtesting metric: 265.55728709347545
  Levels: ['Bear_assembly_Angel', 'Bear_assembly_Beatrice', 'Bear_assembly_Danial', 'Bear_assembly_Diana', 'Bear_assembly_Genia', 'Bear_assembly_Harry', 'Bear_assembly_Jose', 'Bear_assembly_Roxy', 'Bear_assembly_Ruby', 'Bear_education_Alfredo', 'Bear_education_Alvaro', 'Bear_education_Arnold', 'Bear_education_Augusta', 'Bear_education_Austin', 'Bear_education_Babara', 'Bear_education_Benita', 'Bear_education_Benjamin', 'Bear_education_Bob', 'Bear_education_Bonita', 'Bear_education_Bulah', 'Bear_education_Carlo', 'Bear_education_Chad', 'Bear_education_Chana', 'Bear_education_Chun', 'Bear_education_Clint', 'Bear_education_Curtis', 'Bear_education_Danna', 'Bear_education_Darrell', 'Bear_education_Deborah', 'Bear_education_Deena', 'Bear_education_Derek', 'Bear_education_Destiny', 'Bear_education_Elise', 'Bear_education_Fannie', 'Bear_education_Gavin', 'Bear_education_Herb', 'Bear_education_Herman', 'Bear_education_Holly', 'Bear_education_Irene', 'Bear_education_Iris', 'Bear_education_Katy', 'Bear_education_Lashanda', 'Bear_education_Laurette', 'Bear_education_Leonel', 'Bear_education_Lewis', 'Bear_education_Lidia', 'Bear_education_Lila', 'Bear_education_Liliana', 'Bear_education_Lorie', 'Bear_education_Mara', 'Bear_education_Marie', 'Bear_education_Marsha', 'Bear_education_Marta', 'Bear_education_Maryjane', 'Bear_education_Merrill', 'Bear_education_Millie', 'Bear_education_Nanette', 'Bear_education_Oscar', 'Bear_education_Owen', 'Bear_education_Paola', 'Bear_education_Patricia', 'Bear_education_Pattie', 'Bear_education_Rebecca', 'Bear_education_Sandy', 'Bear_education_Santos', 'Bear_education_Sharon', 'Bear_education_Shelley', 'Bear_education_Tara', 'Bear_education_Thuy', 'Bear_education_Val', 'Bear_education_Wade', 'Bear_education_Wilton', 'Bear_education_Wm', 'Bear_education_Yuri', 'Bear_education_Yvette', 'Bear_education_Zandra', 'Bear_lodging_Dannie', 'Bear_lodging_Erick', 'Bear_lodging_Esperanza', 'Bear_lodging_Evan', 'Bear_parking_Bridget', 'Bear_parking_Bruce', 
'Bear_parking_Gordon', 'Bear_public_Arlean', 'Bear_public_Jocelyn', 'Bear_public_Lowell', 'Bear_public_Orville', 'Bear_public_Rayna', 'Bear_public_Valorie', 'Bear_science_Alison', 'Bear_science_Eleanor', 'Bear_utility_Sidney', 'Bobcat_assembly_Adam', 'Bobcat_assembly_Billy', 'Bobcat_assembly_Camilla', 'Bobcat_assembly_Franklin', 'Bobcat_education_Alissa', 'Bobcat_education_Angela', 'Bobcat_education_Barbra', 'Bobcat_education_Coleman', 'Bobcat_education_Dylan', 'Bobcat_education_Emile', 'Bobcat_education_Hollis', 'Bobcat_education_Jayne', 'Bobcat_education_Karin', 'Bobcat_education_Marisha', 'Bobcat_education_Miki', 'Bobcat_education_Monte', 'Bobcat_education_Rodrick', 'Bobcat_education_Rosalva', 'Bobcat_education_Seth', 'Bobcat_education_Toni', 'Bobcat_education_Whitney', 'Bobcat_lodging_Darin', 'Bobcat_lodging_Melaine', 'Bobcat_lodging_Nickolas', 'Bobcat_office_Alma', 'Bobcat_office_Justine', 'Bobcat_office_Kassandra', 'Bobcat_office_Melody', 'Bobcat_office_Nikita', 'Bobcat_other_Howard', 'Bobcat_other_Jovita', 'Bobcat_other_Timothy', 'Bobcat_public_Angie', 'Bobcat_science_Tammy', 'Bobcat_warehouse_Charlie', 'Bull_assembly_Amalia', 'Bull_assembly_Beau', 'Bull_assembly_Brandon', 'Bull_assembly_Daryl', 'Bull_assembly_Dorethea', 'Bull_assembly_Freddie', 'Bull_assembly_Gerri', 'Bull_assembly_Gigi', 'Bull_assembly_Goldie', 'Bull_assembly_Katia', 'Bull_assembly_Lance', 'Bull_assembly_Lesa', 'Bull_assembly_Maren', 'Bull_assembly_Nathanial', 'Bull_assembly_Newton', 'Bull_assembly_Nick', 'Bull_assembly_Vanessa', 'Bull_education_Annette', 'Bull_education_Antonia', 'Bull_education_Arthur', 'Bull_education_Barry', 'Bull_education_Bernice', 'Bull_education_Brady', 'Bull_education_Brain', 'Bull_education_Brandi', 'Bull_education_Brenda', 'Bull_education_Bryon', 'Bull_education_Carl', 'Bull_education_Clarice', 'Bull_education_Clarita', 'Bull_education_Dakota', 'Bull_education_Dan', 'Bull_education_Dania', 'Bull_education_Delia', 'Bull_education_Dora', 'Bull_education_Dottie', 
'Bull_education_Elva', 'Bull_education_Fabiola', 'Bull_education_Geneva', 'Bull_education_Genie', 'Bull_education_Gregory', 'Bull_education_Hayley', 'Bull_education_Jae', 'Bull_education_Jeffery', 'Bull_education_Joseph', 'Bull_education_Juan', 'Bull_education_Kendra', 'Bull_education_Krista', 'Bull_education_Kristal', 'Bull_education_Lenny', 'Bull_education_Linnie', 'Bull_education_Luke', 'Bull_education_Lydia', 'Bull_education_Lyn', 'Bull_education_Magaret', 'Bull_education_Mario', 'Bull_education_Matilda', 'Bull_education_Mervin', 'Bull_education_Miquel', 'Bull_education_Miranda', 'Bull_education_Myra', 'Bull_education_Nichole', 'Bull_education_Nina', 'Bull_education_Nolan', 'Bull_education_Olive', 'Bull_education_Pablo', 'Bull_education_Pamela', 'Bull_education_Patrina', 'Bull_education_Racheal', 'Bull_education_Reina', 'Bull_education_Reynaldo', 'Bull_education_Roland', 'Bull_education_Roseann', 'Bull_education_Sebastian', 'Bull_education_Shona', 'Bull_education_Stewart', 'Bull_education_Summer', 'Bull_education_Tracey', 'Bull_education_Venita', 'Bull_education_Violeta', 'Bull_lodging_Abby', 'Bull_lodging_Allen', 'Bull_lodging_Anibal', 'Bull_lodging_Caren', 'Bull_lodging_Carie', 'Bull_lodging_Charlotte', 'Bull_lodging_Danielle', 'Bull_lodging_Dave', 'Bull_lodging_Elena', 'Bull_lodging_Graciela', 'Bull_lodging_Hugo', 'Bull_lodging_Jeremiah', 'Bull_lodging_Leonard', 'Bull_lodging_Lettie', 'Bull_lodging_Melissa', 'Bull_lodging_Perry', 'Bull_lodging_Terence', 'Bull_lodging_Travis', 'Bull_lodging_Xavier', 'Bull_office_Anne', 'Bull_office_Chantel', 'Bull_office_Claudia', 'Bull_office_Debbie', 'Bull_office_Efren', 'Bull_office_Ella', 'Bull_office_Hilton', 'Bull_office_Ivette', 'Bull_office_Lilla', 'Bull_office_Mai', 'Bull_office_Marco', 'Bull_office_Myron', 'Bull_office_Nicolas', 'Bull_office_Rob', 'Bull_office_Sally', 'Bull_office_Trevor', 'Bull_office_Yvonne', 'Bull_public_Hyun', 'Bull_public_Jefferson', 'Bull_services_Jeanmarie', 'Bull_services_Juanita', 
'Bull_services_Nadine', 'Bull_services_Rachelle', 'Bull_services_Winford', 'Cockatoo_assembly_Britt', 'Cockatoo_assembly_Doyle', 'Cockatoo_assembly_Ed', 'Cockatoo_assembly_Emilio', 'Cockatoo_assembly_Evelyn', 'Cockatoo_assembly_Fernanda', 'Cockatoo_assembly_Genoveva', 'Cockatoo_assembly_Griselda', 'Cockatoo_assembly_Heath', 'Cockatoo_assembly_Meredith', 'Cockatoo_assembly_Mimi', 'Cockatoo_assembly_Pierre', 'Cockatoo_assembly_Ralph', 'Cockatoo_assembly_Rodger', 'Cockatoo_assembly_Tabitha', 'Cockatoo_assembly_Valencia', 'Cockatoo_education_Amira', 'Cockatoo_education_Arlen', 'Cockatoo_education_Brendan', 'Cockatoo_education_Brigitte', 'Cockatoo_education_Charity', 'Cockatoo_education_Christi', 'Cockatoo_education_Claudine', 'Cockatoo_education_Clifton', 'Cockatoo_education_Collin', 'Cockatoo_education_Deloris', 'Cockatoo_education_Doreen', 'Cockatoo_education_Erik', 'Cockatoo_education_Eunice', 'Cockatoo_education_Evangeline', 'Cockatoo_education_Flora', 'Cockatoo_education_Gussie', 'Cockatoo_education_Helene', 'Cockatoo_education_Jack', 'Cockatoo_education_Janet', 'Cockatoo_education_Joeann', 'Cockatoo_education_Joel', 'Cockatoo_education_Jon', 'Cockatoo_education_Julia', 'Cockatoo_education_Julio', 'Cockatoo_education_June', 'Cockatoo_education_Latrice', 'Cockatoo_education_Laurence', 'Cockatoo_education_Lionel', 'Cockatoo_education_Magdalena', 'Cockatoo_education_Marva', 'Cockatoo_education_Maynard', 'Cockatoo_education_Mayra', 'Cockatoo_education_Melanie', 'Cockatoo_education_Minh', 'Cockatoo_education_Nelda', 'Cockatoo_education_Oliver', 'Cockatoo_education_Orlando', 'Cockatoo_education_Rita', 'Cockatoo_education_Shawn', 'Cockatoo_education_Sheryl', 'Cockatoo_education_Terrence', 'Cockatoo_education_Tyler', 'Cockatoo_education_Victoria', 'Cockatoo_industrial_Nathaniel', 'Cockatoo_industrial_Sarita', 'Cockatoo_lodging_Aimee', 'Cockatoo_lodging_Albert', 'Cockatoo_lodging_Alicia', 'Cockatoo_lodging_Ana', 'Cockatoo_lodging_Carmen', 'Cockatoo_lodging_Cletus', 
'Cockatoo_lodging_Elvia', 'Cockatoo_lodging_Emory', 'Cockatoo_lodging_Eric', 'Cockatoo_lodging_Homer', 'Cockatoo_lodging_Jarrod', 'Cockatoo_lodging_Javier', 'Cockatoo_lodging_Jim', 'Cockatoo_lodging_Jimmie', 'Cockatoo_lodging_Johnathan', 'Cockatoo_lodging_Josephine', 'Cockatoo_lodging_Judi', 'Cockatoo_lodging_Katharine', 'Cockatoo_lodging_Kerri', 'Cockatoo_lodging_Kyle', 'Cockatoo_lodging_Lana', 'Cockatoo_lodging_Linwood', 'Cockatoo_lodging_Lynne', 'Cockatoo_lodging_Mandy', 'Cockatoo_lodging_Olga', 'Cockatoo_lodging_Raphael', 'Cockatoo_lodging_Tesha', 'Cockatoo_lodging_Tessie', 'Cockatoo_office_Ada', 'Cockatoo_office_Alton', 'Cockatoo_office_Christy', 'Cockatoo_office_Delores', 'Cockatoo_office_Elbert', 'Cockatoo_office_Gail', 'Cockatoo_office_Georgia', 'Cockatoo_office_Giovanni', 'Cockatoo_office_Jimmy', 'Cockatoo_office_Jodie', 'Cockatoo_office_Kristin', 'Cockatoo_office_Laila', 'Cockatoo_office_Lorraine', 'Cockatoo_office_Margaret', 'Cockatoo_office_Paige', 'Cockatoo_office_Pansy', 'Cockatoo_office_Rodney', 'Cockatoo_office_Roxanna', 'Cockatoo_public_Caleb', 'Cockatoo_public_Chiquita', 'Cockatoo_public_Harland', 'Cockatoo_public_Leah', 'Cockatoo_public_Shad', 'Cockatoo_public_Valerie', 'Cockatoo_religion_Diedre', 'Cockatoo_science_Rex', 'Cockatoo_utility_Kimiko', 'Cockatoo_utility_Sherri', 'Crow_education_Kate', 'Crow_education_Keisha', 'Crow_education_Marlin', 'Crow_education_Omer', 'Crow_education_Winston', 'Eagle_assembly_Benny', 'Eagle_assembly_Candice', 'Eagle_assembly_Estelle', 'Eagle_assembly_Herbert', 'Eagle_assembly_Ian', 'Eagle_assembly_Josie', 'Eagle_assembly_Lacy', 'Eagle_assembly_Latrina', 'Eagle_assembly_Margret', 'Eagle_assembly_Noel', 'Eagle_assembly_Portia', 'Eagle_education_Alberto', 'Eagle_education_April', 'Eagle_education_Brianne', 'Eagle_education_Brooke', 'Eagle_education_Cassie', 'Eagle_education_Edith', 'Eagle_education_Eileen', 'Eagle_education_Jewell', 'Eagle_education_Lino', 'Eagle_education_Luther', 'Eagle_education_Maragret', 
'Eagle_education_Norah', 'Eagle_education_Paul', 'Eagle_education_Peter', 'Eagle_education_Petra', 'Eagle_education_Raul', 'Eagle_education_Roman', 'Eagle_education_Samantha', 'Eagle_education_Shana', 'Eagle_education_Shanna', 'Eagle_education_Shante', 'Eagle_education_Sheena', 'Eagle_education_Sherrill', 'Eagle_education_Teresa', 'Eagle_education_Wesley', 'Eagle_education_Will', 'Eagle_food_Jennifer', 'Eagle_food_Kay', 'Eagle_health_Amy', 'Eagle_health_Athena', 'Eagle_health_Gregoria', 'Eagle_health_Jodi', 'Eagle_health_Lucinda', 'Eagle_health_Margo', 'Eagle_health_Reba', 'Eagle_health_Reuben', 'Eagle_health_Trisha', 'Eagle_health_Vincenza', 'Eagle_lodging_Andy', 'Eagle_lodging_Blake', 'Eagle_lodging_Casey', 'Eagle_lodging_Dawn', 'Eagle_lodging_Edgardo', 'Eagle_lodging_Garland', 'Eagle_lodging_Stephanie', 'Eagle_lodging_Terri', 'Eagle_lodging_Tressa', 'Eagle_lodging_Trina', 'Eagle_office_Amanda', 'Eagle_office_Amie', 'Eagle_office_Bridgett', 'Eagle_office_Chantelle', 'Eagle_office_Chauncey', 'Eagle_office_Dallas', 'Eagle_office_Damian', 'Eagle_office_Demetra', 'Eagle_office_Donovan', 'Eagle_office_Efrain', 'Eagle_office_Elia', 'Eagle_office_Elias', 'Eagle_office_Elvis', 'Eagle_office_Flossie', 'Eagle_office_Francis', 'Eagle_office_Freida', 'Eagle_office_Henriette', 'Eagle_office_Isidro', 'Eagle_office_Jackie', 'Eagle_office_Jeff', 'Eagle_office_Joette', 'Eagle_office_Katheleen', 'Eagle_office_Lamont', 'Eagle_office_Lane', 'Eagle_office_Lillian', 'Eagle_office_Mable', 'Eagle_office_Mandi', 'Eagle_office_Marisela', 'Eagle_office_Michele', 'Eagle_office_Nereida', 'Eagle_office_Norbert', 'Eagle_office_Patrice', 'Eagle_office_Phyllis', 'Eagle_office_Randolph', 'Eagle_office_Remedios', 'Eagle_office_Ryan', 'Eagle_office_Sheree', 'Eagle_office_Sonya', 'Eagle_office_Tia', 'Eagle_office_Yadira', 'Eagle_public_Alvin', 'Eagle_public_Henry', 'Eagle_public_Minnie', 'Eagle_public_Missy', 'Eagle_public_Ola', 'Eagle_public_Pearle', 'Eagle_public_Preston', 'Fox_assembly_Adrianne', 
'Fox_assembly_Audrey', 'Fox_assembly_Boyce', 'Fox_assembly_Bradley', 'Fox_assembly_Carlos', 'Fox_assembly_Cathy', 'Fox_assembly_Cecelia', 'Fox_assembly_Christie', 'Fox_assembly_Cindy', 'Fox_assembly_Dixie', 'Fox_assembly_Emma', 'Fox_assembly_Gary', 'Fox_assembly_Jerrod', 'Fox_assembly_Johnnie', 'Fox_assembly_Kathie', 'Fox_assembly_Lakeisha', 'Fox_assembly_Leeanne', 'Fox_assembly_Renna', 'Fox_assembly_Sheldon', 'Fox_assembly_Terrell', 'Fox_assembly_Tony', 'Fox_education_Andre', 'Fox_education_Ashli', 'Fox_education_Burton', 'Fox_education_Carleen', 'Fox_education_Charles', 'Fox_education_Claire', 'Fox_education_Claude', 'Fox_education_Cynthia', 'Fox_education_Delma', 'Fox_education_Dewayne', 'Fox_education_Dominique', 'Fox_education_Eddy', 'Fox_education_Eldon', 'Fox_education_Elizabeth', 'Fox_education_Elois', 'Fox_education_Elvira', 'Fox_education_Etta', 'Fox_education_Gayla', 'Fox_education_Geoffrey', 'Fox_education_Gloria', 'Fox_education_Henrietta', 'Fox_education_Heriberto', 'Fox_education_Jaclyn', 'Fox_education_Jacqueline', 'Fox_education_Janina', 'Fox_education_John', 'Fox_education_Kendrick', 'Fox_education_Kim', 'Fox_education_Kris', 'Fox_education_Leona', 'Fox_education_Leota', 'Fox_education_Lesley', 'Fox_education_Lilly', 'Fox_education_Long', 'Fox_education_Louie', 'Fox_education_Marcelina', 'Fox_education_Maris', 'Fox_education_Marlana', 'Fox_education_Maureen', 'Fox_education_Melinda', 'Fox_education_Melvin', 'Fox_education_Miguelina', 'Fox_education_Nilda', 'Fox_education_Ollie', 'Fox_education_Otilia', 'Fox_education_Ray', 'Fox_education_Rosie', 'Fox_education_Rudolph', 'Fox_education_Shaun', 'Fox_education_Shawanda', 'Fox_education_Shirley', 'Fox_education_Stacia', 'Fox_education_Sterling', 'Fox_education_Suzan', 'Fox_education_Tamika', 'Fox_education_Theodore', 'Fox_education_Tonya', 'Fox_education_Vernon', 'Fox_education_Virgil', 'Fox_education_Virginia', 'Fox_education_Wendell', 'Fox_education_Willis', 'Fox_education_Yolande', 
'Fox_food_Francesco', 'Fox_food_Scott', 'Fox_health_Lorena', 'Fox_lodging_Alana', 'Fox_lodging_Angla', 'Fox_lodging_Frances', 'Fox_lodging_Helen', 'Fox_lodging_Isabell', 'Fox_lodging_Jina', 'Fox_lodging_Morris', 'Fox_lodging_Stephan', 'Fox_lodging_Stephen', 'Fox_lodging_Wallace', 'Fox_lodging_Warren', 'Fox_lodging_Winifred', 'Fox_office_Alice', 'Fox_office_Bernard', 'Fox_office_Berniece', 'Fox_office_Brandy', 'Fox_office_Carson', 'Fox_office_Clayton', 'Fox_office_Demetrius', 'Fox_office_Easter', 'Fox_office_Edythe', 'Fox_office_Essie', 'Fox_office_Gaylord', 'Fox_office_Israel', 'Fox_office_Joy', 'Fox_office_Juana', 'Fox_office_Karima', 'Fox_office_Margarita', 'Fox_office_Molly', 'Fox_office_Rowena', 'Fox_office_Sheila', 'Fox_office_Susanne', 'Fox_office_Thelma', 'Fox_office_Vicki', 'Fox_office_Yong', 'Fox_office_Zachary', 'Fox_parking_Felipa', 'Fox_parking_Lynelle', 'Fox_parking_Tommie', 'Fox_public_Bart', 'Fox_public_Belle', 'Fox_public_Denny', 'Fox_public_Lauren', 'Fox_public_Martin', 'Fox_public_Rhonda', 'Fox_religion_Maurice', 'Fox_retail_Manie', 'Fox_utility_Marian', 'Fox_warehouse_Lorretta', 'Fox_warehouse_Pearl', 'Gator_assembly_Alexa', 'Gator_assembly_Bailey', 'Gator_assembly_Beryl', 'Gator_assembly_Blanca', 'Gator_assembly_Daisy', 'Gator_assembly_Elliot', 'Gator_assembly_Enid', 'Gator_assembly_Erich', 'Gator_assembly_Gene', 'Gator_assembly_Hue', 'Gator_assembly_Joni', 'Gator_assembly_Kayleigh', 'Gator_assembly_Kimberly', 'Gator_assembly_Lelia', 'Gator_assembly_Lera', 'Gator_assembly_Lilli', 'Gator_assembly_Loyce', 'Gator_assembly_Lucia', 'Gator_assembly_Marjorie', 'Gator_assembly_Maurine', 'Gator_assembly_Milton', 'Gator_assembly_Regina', 'Gator_assembly_Roy', 'Gator_assembly_Selma', 'Gator_assembly_Stacy', 'Gator_assembly_Virgie', 'Gator_office_August', 'Gator_office_Betty', 'Gator_office_Carrie', 'Gator_office_Hunter', 'Gator_office_Julie', 'Gator_office_Lisa', 'Gator_office_Lucy', 'Gator_office_Merle', 'Gator_other_Cassandra', 'Gator_other_Elfriede', 
'Gator_other_Gertrude', 'Gator_other_Glen', 'Gator_other_Minda', 'Gator_other_Refugio', 'Gator_other_Reginald', 'Gator_other_Russel', 'Gator_other_Samuel', 'Gator_public_Alexandra', 'Gator_public_Beulah', 'Gator_public_Cheri', 'Gator_public_Clara', 'Gator_public_Dale', 'Gator_public_Dewey', 'Gator_public_Dionna', 'Gator_public_Erika', 'Gator_public_Everette', 'Gator_public_Geraldine', 'Gator_public_Janna', 'Gator_public_Jayme', 'Gator_public_Jolene', 'Gator_public_Kendall', 'Gator_public_Kenny', 'Gator_public_Latasha', 'Gator_public_Leroy', 'Gator_public_Lindsey', 'Gator_public_Lona', 'Gator_public_Marcie', 'Gator_public_Marissa', 'Gator_public_Maude', 'Gator_public_Natasha', 'Gator_public_Nettie', 'Gator_public_Noe', 'Gator_public_Philip', 'Gator_public_Randall', 'Gator_public_Ross', 'Gator_public_Tiffany', 'Gator_warehouse_Constance', 'Gator_warehouse_Stacie', 'Hog_assembly_Annemarie', 'Hog_assembly_Arlie', 'Hog_assembly_Colette', 'Hog_assembly_Dona', 'Hog_assembly_Edward', 'Hog_assembly_Jasmine', 'Hog_assembly_Letha', 'Hog_assembly_Maribel', 'Hog_assembly_Marilynn', 'Hog_assembly_Pedro', 'Hog_assembly_Una', 'Hog_education_Beth', 'Hog_education_Bruno', 'Hog_education_Caridad', 'Hog_education_Casandra', 'Hog_education_Cathleen', 'Hog_education_Darryl', 'Hog_education_Donnie', 'Hog_education_George', 'Hog_education_Hallie', 'Hog_education_Haywood', 'Hog_education_Janell', 'Hog_education_Jared', 'Hog_education_Jewel', 'Hog_education_Jordan', 'Hog_education_Josh', 'Hog_education_Leandro', 'Hog_education_Luvenia', 'Hog_education_Madge', 'Hog_education_Odell', 'Hog_education_Rachael', 'Hog_education_Robert', 'Hog_education_Roberto', 'Hog_education_Sonia', 'Hog_education_Wayne', 'Hog_food_Morgan', 'Hog_health_Hisako', 'Hog_health_Jenny', 'Hog_health_Kesha', 'Hog_industrial_Jay', 'Hog_industrial_Jeremy', 'Hog_industrial_Joanne', 'Hog_industrial_Mariah', 'Hog_industrial_Quentin', 'Hog_lodging_Brian', 'Hog_lodging_Celeste', 'Hog_lodging_Edgar', 'Hog_lodging_Francisco', 
'Hog_lodging_Hal', 'Hog_lodging_Mauricio', 'Hog_lodging_Nikki', 'Hog_lodging_Ora', 'Hog_lodging_Retha', 'Hog_lodging_Shanti', 'Hog_lodging_Shonda', 'Hog_office_Alexis', 'Hog_office_Alisha', 'Hog_office_Almeda', 'Hog_office_Bessie', 'Hog_office_Betsy', 'Hog_office_Bill', 'Hog_office_Bryan', 'Hog_office_Buford', 'Hog_office_Byron', 'Hog_office_Candi', 'Hog_office_Carri', 'Hog_office_Catalina', 'Hog_office_Catharine', 'Hog_office_Charla', 'Hog_office_Clemencia', 'Hog_office_Concetta', 'Hog_office_Cordelia', 'Hog_office_Corey', 'Hog_office_Corie', 'Hog_office_Cornell', 'Hog_office_Cortney', 'Hog_office_Darline', 'Hog_office_Denita', 'Hog_office_Elizbeth', 'Hog_office_Elke', 'Hog_office_Elnora', 'Hog_office_Eloise', 'Hog_office_Elsy', 'Hog_office_Emmanuel', 'Hog_office_Garrett', 'Hog_office_Guadalupe', 'Hog_office_Gustavo', 'Hog_office_Joey', 'Hog_office_Josefina', 'Hog_office_Judith', 'Hog_office_Lanell', 'Hog_office_Lavon', 'Hog_office_Leanne', 'Hog_office_Leon', 'Hog_office_Lizzie', 'Hog_office_Mack', 'Hog_office_Man', 'Hog_office_Mari', 'Hog_office_Marilyn', 'Hog_office_Marlena', 'Hog_office_Marlene', 'Hog_office_Mary', 'Hog_office_Merilyn', 'Hog_office_Migdalia', 'Hog_office_Mike', 'Hog_office_Miriam', 'Hog_office_Myles', 'Hog_office_Nancie', 'Hog_office_Napoleon', 'Hog_office_Nia', 'Hog_office_Patrick', 'Hog_office_Randi', 'Hog_office_Richelle', 'Hog_office_Roger', 'Hog_office_Rolando', 'Hog_office_Sarah', 'Hog_office_Shawna', 'Hog_office_Shawnna', 'Hog_office_Sherrie', 'Hog_office_Shon', 'Hog_office_Simon', 'Hog_office_Sonny', 'Hog_office_Sung', 'Hog_office_Sydney', 'Hog_office_Terry', 'Hog_office_Thomas', 'Hog_office_Valda', 'Hog_office_Vera', 'Hog_other_Lynette', 'Hog_other_Noma', 'Hog_other_Tobias', 'Hog_parking_Antoinette', 'Hog_parking_Bernardo', 'Hog_parking_Cliff', 'Hog_parking_Jean', 'Hog_parking_Jeana', 'Hog_parking_Joan', 'Hog_parking_Marcus', 'Hog_parking_Shannon', 'Hog_public_Brad', 'Hog_public_Crystal', 'Hog_public_Gerard', 'Hog_public_Kevin', 
'Hog_public_Octavia', 'Hog_science_Max', 'Hog_services_Adrianna', 'Hog_services_Joe', 'Hog_services_Kerrie', 'Hog_services_Marshall', 'Hog_warehouse_Louise', 'Hog_warehouse_Porsha', 'Hog_warehouse_Rosanna', 'Lamb_assembly_Alden', 'Lamb_assembly_Bertie', 'Lamb_assembly_Cesar', 'Lamb_assembly_Cherie', 'Lamb_assembly_Corliss', 'Lamb_assembly_Delilah', 'Lamb_assembly_Dillon', 'Lamb_assembly_Dorathy', 'Lamb_assembly_Dorothy', 'Lamb_assembly_Dudley', 'Lamb_assembly_Elinor', 'Lamb_assembly_Ethel', 'Lamb_assembly_Eugenia', 'Lamb_assembly_Isa', 'Lamb_assembly_Jerry', 'Lamb_assembly_Katelyn', 'Lamb_assembly_Kurt', 'Lamb_assembly_Librada', 'Lamb_assembly_Louis', 'Lamb_assembly_Nicole', 'Lamb_assembly_Ossie', 'Lamb_assembly_Queen', 'Lamb_assembly_Rosa', 'Lamb_assembly_Steven', 'Lamb_assembly_Tasha', 'Lamb_assembly_Tawana', 'Lamb_assembly_Theresa', 'Lamb_assembly_Walter', 'Lamb_assembly_Zita', 'Lamb_education_Aldo', 'Lamb_education_Alina', 'Lamb_education_Antonio', 'Lamb_education_Armando', 'Lamb_education_Augustine', 'Lamb_education_Bert', 'Lamb_education_Bettye', 'Lamb_education_Camille', 'Lamb_education_Carlton', 'Lamb_education_Chet', 'Lamb_education_Daniel', 'Lamb_education_Darrel', 'Lamb_education_Debby', 'Lamb_education_Dolly', 'Lamb_education_Domitila', 'Lamb_education_Dwayne', 'Lamb_education_Ellen', 'Lamb_education_Emery', 'Lamb_education_Emilie', 'Lamb_education_Eula', 'Lamb_education_Faith', 'Lamb_education_Felicia', 'Lamb_education_Felipe', 'Lamb_education_Fred', 'Lamb_education_Freddy', 'Lamb_education_Gabriel', 'Lamb_education_Gabrielle', 'Lamb_education_Garry', 'Lamb_education_Harold', 'Lamb_education_Heidi', 'Lamb_education_Hellen', 'Lamb_education_Hilary', 'Lamb_education_Hillary', 'Lamb_education_Hubert', 'Lamb_education_Hui', 'Lamb_education_Ira', 'Lamb_education_Isabelle', 'Lamb_education_Jane', 'Lamb_education_Junior', 'Lamb_education_Kasha', 'Lamb_education_Kayla', 'Lamb_education_Larissa', 'Lamb_education_Lawrence', 'Lamb_education_Lazaro', 
'Lamb_education_Lemuel', 'Lamb_education_Leopoldo', 'Lamb_education_Logan', 'Lamb_education_Lucas', 'Lamb_education_Luz', 'Lamb_education_Mae', 'Lamb_education_Manuel', 'Lamb_education_Marc', 'Lamb_education_Maritza', 'Lamb_education_Marty', 'Lamb_education_Maxwell', 'Lamb_education_Mckenzie', 'Lamb_education_Moses', 'Lamb_education_Nathan', 'Lamb_education_Nichol', 'Lamb_education_Norris', 'Lamb_education_Patsy', 'Lamb_education_Phil', 'Lamb_education_Philomena', 'Lamb_education_Princess', 'Lamb_education_Randal', 'Lamb_education_Renae', 'Lamb_education_Rick', 'Lamb_education_Robin', 'Lamb_education_Rodrigo', 'Lamb_education_Ruben', 'Lamb_education_Sabrina', 'Lamb_education_Salvador', 'Lamb_education_Sara', 'Lamb_education_Stefan', 'Lamb_education_Sunny', 'Lamb_education_Sylvester', 'Lamb_education_Terina', 'Lamb_education_Traci', 'Lamb_education_Vaughn', 'Lamb_education_Wanda', 'Lamb_education_Wilbert', 'Lamb_education_Willetta', 'Lamb_education_Williams', 'Lamb_food_Sylvia', 'Lamb_health_Ken', 'Lamb_industrial_Carla', 'Lamb_industrial_Enrique', 'Lamb_industrial_Venessa', 'Lamb_industrial_Willard', 'Lamb_lodging_Burt', 'Lamb_lodging_Harley', 'Lamb_office_Bertha', 'Lamb_office_Caitlin', 'Lamb_office_Callie', 'Lamb_office_Corine', 'Lamb_office_Donita', 'Lamb_office_Gerardo', 'Lamb_office_Jo', 'Lamb_office_Joanna', 'Lamb_office_Kent', 'Lamb_office_Kerry', 'Lamb_office_Maggie', 'Lamb_office_Peggy', 'Lamb_office_Raymond', 'Lamb_office_Stefani', 'Lamb_office_Vasiliki', 'Lamb_office_Velma', 'Lamb_office_William', 'Lamb_other_Katharina', 'Lamb_other_Minerva', 'Lamb_public_Angeline', 'Lamb_public_Bradly', 'Lamb_public_Grace', 'Lamb_public_Gracie', 'Lamb_public_Nyla', 'Lamb_public_Vania', 'Lamb_warehouse_Allan', 'Moose_education_Abbie', 'Moose_education_Clark', 'Moose_education_Diane', 'Moose_education_Florence', 'Moose_education_Gladys', 'Moose_education_Leland', 'Moose_education_Lori', 'Moose_education_Maria', 'Moose_education_Marina', 'Moose_education_Marlon', 
'Moose_education_Rene', 'Moose_education_Ricardo', 'Moose_education_Sasha', 'Mouse_health_Buddy', 'Mouse_health_Estela', 'Mouse_health_Ileana', 'Mouse_health_Justin', 'Mouse_health_Modesto', 'Mouse_lodging_Vicente', 'Mouse_science_Micheal', 'Panther_assembly_Carrol', 'Panther_assembly_David', 'Panther_assembly_Denice', 'Panther_assembly_Gwyneth', 'Panther_assembly_Pamella', 'Panther_education_Alecia', 'Panther_education_Annetta', 'Panther_education_Aurora', 'Panther_education_Cleopatra', 'Panther_education_Diann', 'Panther_education_Edna', 'Panther_education_Emily', 'Panther_education_Enriqueta', 'Panther_education_Genevieve', 'Panther_education_Gina', 'Panther_education_Hugh', 'Panther_education_Ivan', 'Panther_education_Janis', 'Panther_education_Jerome', 'Panther_education_Jonathan', 'Panther_education_Karri', 'Panther_education_Mattie', 'Panther_education_Misty', 'Panther_education_Mohammad', 'Panther_education_Neal', 'Panther_education_Quintin', 'Panther_education_Rosalie', 'Panther_education_Scarlett', 'Panther_education_Shelton', 'Panther_education_Sophia', 'Panther_education_Teofila', 'Panther_education_Tina', 'Panther_education_Vincent', 'Panther_education_Violet', 'Panther_education_Zelda', 'Panther_lodging_Alita', 'Panther_lodging_Anastasia', 'Panther_lodging_Awilda', 'Panther_lodging_Blaine', 'Panther_lodging_Cora', 'Panther_lodging_Cornelia', 'Panther_lodging_Dianna', 'Panther_lodging_Edison', 'Panther_lodging_Edmond', 'Panther_lodging_Else', 'Panther_lodging_Fausto', 'Panther_lodging_Floyd', 'Panther_lodging_Gale', 'Panther_lodging_Hattie', 'Panther_lodging_Jana', 'Panther_lodging_Janice', 'Panther_lodging_Jorge', 'Panther_lodging_Kaitlin', 'Panther_lodging_Kara', 'Panther_lodging_Kirk', 'Panther_lodging_Marisol', 'Panther_lodging_Myrtle', 'Panther_lodging_Russell', 'Panther_lodging_Sonja', 'Panther_lodging_Teresita', 'Panther_lodging_Tracie', 'Panther_lodging_Willa', 'Panther_office_Antonette', 'Panther_office_Brent', 'Panther_office_Catherine', 
'Panther_office_Christian', 'Panther_office_Christin', 'Panther_office_Clementine', 'Panther_office_Danica', 'Panther_office_Garth', 'Panther_office_Graham', 'Panther_office_Hannah', 'Panther_office_Jeane', 'Panther_office_Jesus', 'Panther_office_Karla', 'Panther_office_Kristen', 'Panther_office_Larry', 'Panther_office_Lauretta', 'Panther_office_Lavinia', 'Panther_office_Lois', 'Panther_office_Otto', 'Panther_office_Patti', 'Panther_office_Ruthie', 'Panther_office_Shauna', 'Panther_office_Taryn', 'Panther_office_Valarie', 'Panther_other_Bethel', 'Panther_other_Bettie', 'Panther_other_Lucina', 'Panther_other_Lula', 'Panther_other_Tyrone', 'Panther_parking_Adela', 'Panther_parking_Alaina', 'Panther_parking_Asia', 'Panther_parking_Charlene', 'Panther_parking_Jody', 'Panther_parking_Lorriane', 'Panther_parking_Mellissa', 'Panther_parking_Stanley', 'Panther_retail_Felix', 'Panther_retail_Gilbert', 'Panther_retail_Kristina', 'Panther_retail_Lester', 'Panther_retail_Rachel', 'Panther_retail_Romeo', 'Peacock_assembly_Dena', 'Peacock_assembly_Mamie', 'Peacock_assembly_Russ', 'Peacock_assembly_Socorro', 'Peacock_education_Anita', 'Peacock_education_Bianca', 'Peacock_education_Dustin', 'Peacock_education_Forest', 'Peacock_education_Gilberto', 'Peacock_education_Joshua', 'Peacock_education_Karl', 'Peacock_education_Karyn', 'Peacock_education_Lucie', 'Peacock_education_Lyle', 'Peacock_education_Ophelia', 'Peacock_education_Pasquale', 'Peacock_education_Patience', 'Peacock_education_Robbie', 'Peacock_education_Shelly', 'Peacock_education_Weldon', 'Peacock_education_Yolanda', 'Peacock_lodging_Chloe', 'Peacock_lodging_Francesca', 'Peacock_lodging_Jamaal', 'Peacock_lodging_James', 'Peacock_lodging_Lou', 'Peacock_lodging_Mathew', 'Peacock_lodging_Matthew', 'Peacock_lodging_Nova', 'Peacock_lodging_Sergio', 'Peacock_lodging_Terrie', 'Peacock_lodging_Wes', 'Peacock_office_Annie', 'Peacock_office_Burl', 'Peacock_office_Dara', 'Peacock_office_Effie', 'Peacock_office_Elton', 
'Peacock_office_Glenn', 'Peacock_office_Jonathon', 'Peacock_office_Julian', 'Peacock_office_Major', 'Peacock_office_Naomi', 'Peacock_office_Norman', 'Peacock_public_Kelvin', 'Peacock_public_Linda', 'Rat_assembly_Adolfo', 'Rat_assembly_Aisha', 'Rat_assembly_Alex', 'Rat_assembly_Archie', 'Rat_assembly_Aubrey', 'Rat_assembly_Cristina', 'Rat_assembly_Damaris', 'Rat_assembly_Deandre', 'Rat_assembly_Don', 'Rat_assembly_Donny', 'Rat_assembly_Dovie', 'Rat_assembly_Erwin', 'Rat_assembly_Ezequiel', 'Rat_assembly_Francine', 'Rat_assembly_Frieda', 'Rat_assembly_Gerald', 'Rat_assembly_Gwen', 'Rat_assembly_Horace', 'Rat_assembly_Ida', 'Rat_assembly_Jamie', 'Rat_assembly_Jannie', 'Rat_assembly_Jennie', 'Rat_assembly_Kaitlyn', 'Rat_assembly_Karen', 'Rat_assembly_Kelley', 'Rat_assembly_Kenya', 'Rat_assembly_Kimberley', 'Rat_assembly_Kristine', 'Rat_assembly_Kristy', 'Rat_assembly_Lillie', 'Rat_assembly_Michel', 'Rat_assembly_Mirta', 'Rat_assembly_Monica', 'Rat_assembly_Myrna', 'Rat_assembly_Pam', 'Rat_assembly_Pauline', 'Rat_assembly_Rolland', 'Rat_assembly_Rosemarie', 'Rat_assembly_Ruth', 'Rat_assembly_Silvia', 'Rat_assembly_Suzanne', 'Rat_assembly_Teddy', 'Rat_assembly_Teodoro', 'Rat_assembly_Trent', 'Rat_assembly_Trudy', 'Rat_assembly_Victorina', 'Rat_assembly_Viola', 'Rat_education_Abigail', 'Rat_education_Adell', 'Rat_education_Adrian', 'Rat_education_Alfonso', 'Rat_education_Alfred', 'Rat_education_Alonzo', 'Rat_education_Alyson', 'Rat_education_Angelica', 'Rat_education_Barbara', 'Rat_education_Barney', 'Rat_education_Beverly', 'Rat_education_Brett', 'Rat_education_Bryant', 'Rat_education_Calvin', 'Rat_education_Candida', 'Rat_education_Carmela', 'Rat_education_Cecil', 'Rat_education_Cedric', 'Rat_education_Chance', 'Rat_education_Cinthia', 'Rat_education_Colin', 'Rat_education_Conrad', 'Rat_education_Dana', 'Rat_education_Dann', 'Rat_education_Davis', 'Rat_education_Deanna', 'Rat_education_Debra', 'Rat_education_Denise', 'Rat_education_Dianne', 'Rat_education_Donnell', 
'Rat_education_Dreama', 'Rat_education_Earl', 'Rat_education_Earnest', 'Rat_education_Edmund', 'Rat_education_Eleonora', 'Rat_education_Elisa', 'Rat_education_Elsie', 'Rat_education_Esther', 'Rat_education_Everett', 'Rat_education_Fernando', 'Rat_education_Francisca', 'Rat_education_Gricelda', 'Rat_education_Guillermo', 'Rat_education_Humberto', 'Rat_education_Imelda', 'Rat_education_Irma', 'Rat_education_Jacob', 'Rat_education_Jame', 'Rat_education_Jeanne', 'Rat_education_Jena', 'Rat_education_Jesse', 'Rat_education_Kandice', 'Rat_education_Kathryn', 'Rat_education_Keith', 'Rat_education_Kelsey', 'Rat_education_Kristie', 'Rat_education_Lee', 'Rat_education_Leonardo', 'Rat_education_Lincoln', 'Rat_education_Liz', 'Rat_education_Lonnie', 'Rat_education_Lynn', 'Rat_education_Mac', 'Rat_education_Marcos', 'Rat_education_Marianna', 'Rat_education_Maricela', 'Rat_education_Marvin', 'Rat_education_Matt', 'Rat_education_Mavis', 'Rat_education_Meghan', 'Rat_education_Milagros', 'Rat_education_Moises', 'Rat_education_Mona', 'Rat_education_Morton', 'Rat_education_Mose', 'Rat_education_Nellie', 'Rat_education_Nona', 'Rat_education_Nydia', 'Rat_education_Pat', 'Rat_education_Patty', 'Rat_education_Paula', 'Rat_education_Penny', 'Rat_education_Renee', 'Rat_education_Robyn', 'Rat_education_Rogelio', 'Rat_education_Romana', 'Rat_education_Rosalyn', 'Rat_education_Roxanne', 'Rat_education_Royal', 'Rat_education_Salvatore', 'Rat_education_Shellie', 'Rat_education_Sherwood', 'Rat_education_Sina', 'Rat_education_Stuart', 'Rat_education_Susana', 'Rat_education_Tania', 'Rat_education_Terese', 'Rat_education_Theo', 'Rat_education_Tim', 'Rat_education_Tristan', 'Rat_education_Ulrike', 'Rat_education_Verna', 'Rat_education_Veronica', 'Rat_education_Vicky', 'Rat_education_Willie', 'Rat_education_Willy', 'Rat_education_Wilmer', 'Rat_education_Winnie', 'Rat_education_Yu', 'Rat_education_Zina', 'Rat_education_Zoe', 'Rat_health_Ann', 'Rat_health_Gaye', 'Rat_health_Guy', 'Rat_health_Mildred', 
'Rat_health_Rosaria', 'Rat_health_Shane', 'Rat_health_Tanya', 'Rat_lodging_Ardell', 'Rat_lodging_Ben', 'Rat_lodging_Christine', 'Rat_lodging_Dwight', 'Rat_lodging_Jeannette', 'Rat_lodging_Lakisha', 'Rat_lodging_Lorenzo', 'Rat_lodging_Lucille', 'Rat_lodging_Marguerite', 'Rat_lodging_Marion', 'Rat_lodging_Ted', 'Rat_office_Adele', 'Rat_office_Annis', 'Rat_office_Arron', 'Rat_office_Ashlee', 'Rat_office_Avis', 'Rat_office_Chris', 'Rat_office_Colby', 'Rat_office_Craig', 'Rat_office_Jacinta', 'Rat_office_Jamal', 'Rat_office_Jeannie', 'Rat_office_Jessica', 'Rat_office_Jill', 'Rat_office_Kasey', 'Rat_office_Lora', 'Rat_office_Loyd', 'Rat_office_Mei', 'Rat_office_Olivia', 'Rat_office_Ramiro', 'Rat_office_Randy', 'Rat_office_Ronald', 'Rat_office_Rosemary', 'Rat_office_Sammy', 'Rat_office_Tracy', 'Rat_other_Al', 'Rat_other_Daphne', 'Rat_other_Hazel', 'Rat_other_Lan', 'Rat_parking_Ronnie', 'Rat_public_Alanna', 'Rat_public_Alexander', 'Rat_public_Allie', 'Rat_public_Allyson', 'Rat_public_Amber', 'Rat_public_Andrea', 'Rat_public_Angelina', 'Rat_public_Angle', 'Rat_public_Becky', 'Rat_public_Berry', 'Rat_public_Bronwyn', 'Rat_public_Carole', 'Rat_public_Caroline', 'Rat_public_Chrissy', 'Rat_public_Clyde', 'Rat_public_Corinne', 'Rat_public_Courtney', 'Rat_public_Dalia', 'Rat_public_Damon', 'Rat_public_Darren', 'Rat_public_Deidre', 'Rat_public_Desiree', 'Rat_public_Dexter', 'Rat_public_Duane', 'Rat_public_Elmira', 'Rat_public_Emilee', 'Rat_public_Faye', 'Rat_public_Fern', 'Rat_public_Frank', 'Rat_public_Frederick', 'Rat_public_Fredrick', 'Rat_public_Grover', 'Rat_public_Helena', 'Rat_public_Hortencia', 'Rat_public_Isabel', 'Rat_public_Jason', 'Rat_public_Joann', 'Rat_public_Johnna', 'Rat_public_Johnny', 'Rat_public_Josiah', 'Rat_public_Julieann', 'Rat_public_Kathleen', 'Rat_public_Kelle', 'Rat_public_Kelly', 'Rat_public_Kermit', 'Rat_public_Kimber', 'Rat_public_Laura', 'Rat_public_Laurie', 'Rat_public_Laverne', 'Rat_public_Lea', 'Rat_public_Leo', 'Rat_public_Leta', 
'Rat_public_Lloyd', 'Rat_public_Loretta', 'Rat_public_Lynda', 'Rat_public_Mabel', 'Rat_public_Marcellus', 'Rat_public_Margart', 'Rat_public_Mark', 'Rat_public_Maxine', 'Rat_public_Michael', 'Rat_public_Michelle', 'Rat_public_Muriel', 'Rat_public_Nancy', 'Rat_public_Neil', 'Rat_public_Nell', 'Rat_public_Nelson', 'Rat_public_Norene', 'Rat_public_Percy', 'Rat_public_Ramon', 'Rat_public_Ramona', 'Rat_public_Roberta', 'Rat_public_Roma', 'Rat_public_Sade', 'Rat_public_Sana', 'Rat_public_Sean', 'Rat_public_Shanta', 'Rat_public_Sharonda', 'Rat_public_Sharron', 'Rat_public_Stacey', 'Rat_public_Stella', 'Rat_public_Sue', 'Rat_public_Tammara', 'Rat_public_Tamra', 'Rat_public_Tommy', 'Rat_public_Toya', 'Rat_public_Tricia', 'Rat_public_Ulysses', 'Rat_public_Vickie', 'Rat_public_Wilbur', 'Rat_public_Wilma', 'Rat_public_Yessenia', 'Rat_public_Yetta', 'Rat_religion_Kathy', 'Rat_retail_Jeffrey', 'Rat_warehouse_Breanna', 'Rat_warehouse_Doretta', 'Rat_warehouse_Eloisa', 'Rat_warehouse_Maegan', 'Rat_warehouse_Shari', 'Robin_assembly_Colleen', 'Robin_education_Audrea', 'Robin_education_Billi', 'Robin_education_Cecilia', 'Robin_education_Della', 'Robin_education_Derick', 'Robin_education_Derrick', 'Robin_education_Jasper', 'Robin_education_Julius', 'Robin_education_Karyl', 'Robin_education_Kiera', 'Robin_education_Kristopher', 'Robin_education_Lashandra', 'Robin_education_Leslie', 'Robin_education_Lizbeth', 'Robin_education_Madeline', 'Robin_education_Margarito', 'Robin_education_Megan', 'Robin_education_Mercedes', 'Robin_education_So', 'Robin_education_Takako', 'Robin_education_Terrance', 'Robin_education_Zenia', 'Robin_lodging_Armand', 'Robin_lodging_Celia', 'Robin_lodging_Donna', 'Robin_lodging_Dorthy', 'Robin_lodging_Elmer', 'Robin_lodging_Janie', 'Robin_lodging_Oliva', 'Robin_lodging_Phillip', 'Robin_lodging_Pricilla', 'Robin_lodging_Renea', 'Robin_office_Addie', 'Robin_office_Adolph', 'Robin_office_Antonina', 'Robin_office_Dina', 'Robin_office_Donald', 'Robin_office_Erma', 
'Robin_office_Gayle', 'Robin_office_Lindsay', 'Robin_office_Maryann', 'Robin_office_Sammie', 'Robin_office_Saul', 'Robin_office_Serena', 'Robin_office_Shirlene', 'Robin_office_Soledad', 'Robin_office_Victor', 'Robin_office_Wai', 'Robin_office_Zelma', 'Robin_public_Cami', 'Robin_public_Carolina', 'Shrew_office_Doris', 'Shrew_office_Doug', 'Shrew_office_Ila', 'Shrew_office_Katherine', 'Shrew_office_Kenneth', 'Shrew_office_Lin', 'Shrew_office_Nora', 'Shrew_office_Rose', 'Shrew_office_Sherill', 'Swan_unknown_Allison', 'Swan_unknown_Andres', 'Swan_unknown_Bette', 'Swan_unknown_Christoper', 'Swan_unknown_Douglas', 'Swan_unknown_Esteban', 'Swan_unknown_Fabian', 'Swan_unknown_Ike', 'Swan_unknown_Isaiah', 'Swan_unknown_Jan', 'Swan_unknown_Jerold', 'Swan_unknown_Noelia', 'Swan_unknown_Raquel', 'Swan_unknown_Reyna', 'Swan_unknown_Rocco', 'Swan_unknown_Rudy', 'Swan_unknown_Tom', 'Swan_unknown_Valeria', 'Swan_unknown_Wendy', 'Wolf_assembly_Elaine', 'Wolf_assembly_Sallie', 'Wolf_education_Anisa', 'Wolf_education_Arnulfo', 'Wolf_education_Bobby', 'Wolf_education_Cheryl', 'Wolf_education_Clarissa', 'Wolf_education_Cody', 'Wolf_education_Dolores', 'Wolf_education_Dorris', 'Wolf_education_Eulalia', 'Wolf_education_Joaquin', 'Wolf_education_Josefa', 'Wolf_education_Katie', 'Wolf_education_Laurinda', 'Wolf_education_Loren', 'Wolf_education_Miguel', 'Wolf_education_Roderick', 'Wolf_education_Tammie', 'Wolf_education_Tori', 'Wolf_education_Ursula', 'Wolf_education_Vivian', 'Wolf_office_Bobbie', 'Wolf_office_Cary', 'Wolf_office_Darleen', 'Wolf_office_Elisabeth', 'Wolf_office_Emanuel', 'Wolf_office_Haydee', 'Wolf_office_Joana', 'Wolf_office_Nadia', 'Wolf_office_Rochelle', 'Wolf_public_Norma', 'Wolf_retail_Harriett', 'Wolf_retail_Marcella', 'Wolf_retail_Toshia', 'Wolf_science_Alfreda']

Out[ ]:
levels lags params mean_absolute_error__weighted_average mean_absolute_error__average mean_absolute_error__pooling n_estimators max_depth min_data_in_leaf learning_rate feature_fraction max_bin reg_alpha reg_lambda
0 [Bear_assembly_Angel, Bear_assembly_Beatrice, ... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 600, 'max_depth': 6, 'min_dat... 265.557287 265.557287 265.557287 600.0 6.0 188.0 0.159019 0.6 100.0 0.9 0.5
1 [Bear_assembly_Angel, Bear_assembly_Beatrice, ... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 500, 'max_depth': 5, 'min_dat... 266.996235 266.996235 266.996235 500.0 5.0 227.0 0.163008 0.6 100.0 1.0 0.5
2 [Bear_assembly_Angel, Bear_assembly_Beatrice, ... [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... {'n_estimators': 400, 'max_depth': 5, 'min_dat... 267.914068 267.914068 267.914068 400.0 5.0 437.0 0.132723 0.6 100.0 0.5 0.6

Backtesting on test data

In [17]:
# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
    initial_train_size = 608 + 60, # Training + validation
    steps              = 7,
    refit              = False
)
metrics, predictions = backtesting_forecaster_multiseries(
                          forecaster        = forecaster,
                          series            = series_dict,
                          exog              = exog_dict,
                          cv                = cv,
                          metric            = 'mean_absolute_error',
                          verbose           = False,
                          show_progress     = True,
                          suppress_warnings = True
                      )

display(predictions.head())
display(metrics)
Bear_assembly_Angel Bear_assembly_Beatrice Bear_assembly_Danial Bear_assembly_Diana Bear_assembly_Genia Bear_assembly_Harry Bear_assembly_Jose Bear_assembly_Roxy Bear_assembly_Ruby Bear_education_Alfredo ... Wolf_office_Emanuel Wolf_office_Haydee Wolf_office_Joana Wolf_office_Nadia Wolf_office_Rochelle Wolf_public_Norma Wolf_retail_Harriett Wolf_retail_Marcella Wolf_retail_Toshia Wolf_science_Alfreda
2017-10-30 10592.607229 1032.696072 4056.629515 27.406207 7486.302404 439.881083 6879.087015 552.453275 1655.853841 38.709664 ... 361.337366 210.060744 321.831843 1673.708462 375.908451 3402.690970 1150.926131 161.710705 1500.964495 1743.394140
2017-10-31 10472.621539 1023.933848 3820.851897 42.918610 7007.321492 417.956425 6512.836993 485.334172 1482.183104 96.075704 ... 395.653458 188.468579 283.600247 1731.282896 410.224543 3771.871009 1238.155550 148.884883 1655.542029 1836.705983
2017-11-01 11252.961732 1165.450647 4115.225136 54.892865 7428.776862 405.362192 6823.843641 444.216678 1540.435683 124.838844 ... 442.972649 221.391340 309.578692 1770.081731 447.536830 4033.487825 1119.896803 189.380181 1754.014707 1872.483485
2017-11-02 10372.171087 1074.100181 4222.252035 57.811276 7436.383635 410.521974 6778.676617 473.724683 1708.998342 133.607039 ... 380.947298 191.304045 266.644451 1778.210035 431.739530 4007.392813 1096.840039 139.286319 1625.423470 1834.694876
2017-11-03 10291.017747 1235.674033 4151.233121 61.884943 7836.754401 410.308345 6481.923827 502.925866 1664.724787 113.980118 ... 383.486604 229.516185 307.146298 1797.058633 407.666263 3863.285633 1096.096599 209.491574 1610.787386 1819.461818

5 rows × 1578 columns

levels mean_absolute_error
0 Bear_assembly_Angel 1242.274464
1 Bear_assembly_Beatrice 209.284077
2 Bear_assembly_Danial 311.835036
3 Bear_assembly_Diana 23.336818
4 Bear_assembly_Genia 643.629645
... ... ...
1576 Wolf_retail_Toshia 536.740018
1577 Wolf_science_Alfreda 160.150092
1578 average 336.462095
1579 weighted_average 336.462095
1580 pooling 336.462095

1581 rows × 2 columns
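
The last three rows of the metrics table are aggregations over all levels. The sketch below illustrates one common reading of these names: a simple mean of the per-series metric (`average`), a mean weighted by the number of predicted values per series (`weighted_average`), and the metric computed on all predictions stacked together (`pooling`). The helper `aggregate_mae` and the toy values are hypothetical and are not part of skforecast; the code only shows why the three figures can coincide, as they do above, when every series has the same number of predictions.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error of a single series."""
    return float(np.mean(np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))))

def aggregate_mae(y_true_by_level, y_pred_by_level):
    """Hypothetical helper: three ways of aggregating a per-series MAE."""
    levels = list(y_true_by_level)
    per_level = np.array([mae(y_true_by_level[k], y_pred_by_level[k]) for k in levels])
    n_preds = np.array([len(y_true_by_level[k]) for k in levels])
    pooled_true = np.concatenate([np.asarray(y_true_by_level[k], dtype=float) for k in levels])
    pooled_pred = np.concatenate([np.asarray(y_pred_by_level[k], dtype=float) for k in levels])
    return {
        "average": float(per_level.mean()),
        "weighted_average": float(np.average(per_level, weights=n_preds)),
        "pooling": mae(pooled_true, pooled_pred),
    }

# Toy data (made up): two series with three predictions each
y_true = {"A": [10, 12, 14], "B": [100, 110, 120]}
y_pred = {"A": [11, 12, 16], "B": [90, 110, 135]}
equal = aggregate_mae(y_true, y_pred)

# Same data, but series B now has only two predictions
y_true_short = {"A": [10, 12, 14], "B": [100, 110]}
y_pred_short = {"A": [11, 12, 16], "B": [90, 110]}
unequal = aggregate_mae(y_true_short, y_pred_short)

print(equal)    # all three aggregations coincide
print(unequal)  # 'average' differs from the other two
```

With series of equal length the three aggregations are identical. With unequal lengths the simple average gives every series the same weight, while the weighted and pooled versions give more weight to series with more predictions.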

In [18]:
# Aggregated metrics for all buildings
# ==============================================================================
average_metric_all_buildings = metrics.query("levels == 'average'")["mean_absolute_error"].item()
errors_all_buildings = (
    predictions
    - data.pivot(
        columns="building_id",
        values="meter_reading",
    ).loc[predictions.index, predictions.columns]
)
sum_abs_errors_all_buildings = errors_all_buildings.abs().sum().sum()
sum_bias_all_buildings = errors_all_buildings.sum().sum()
print(f"Average mean absolute error for all buildings: {average_metric_all_buildings:.0f}")
print(f"Sum of absolute errors for all buildings (x 10,000): {sum_abs_errors_all_buildings/10000:.0f}")
print(f"Bias (x 10,000): {sum_bias_all_buildings/10000:.0f}")
Average mean absolute error for all buildings: 336
Sum of absolute errors for all buildings (x 10,000): 3345
Bias (x 10,000): 145

In [19]:
# Plot of predictions vs. actual values for two random buildings
# ==============================================================================
rng = np.random.default_rng(14793)
n_buildings = 2
selected_buildings = rng.choice(data['building_id'].unique(), size=n_buildings, replace=False)

fig, axs = plt.subplots(n_buildings, 1, figsize=(7, 4.5), sharex=True)
axs = axs.flatten()

for i, building in enumerate(selected_buildings):
    data.query("building_id == @building").loc[predictions.index, 'meter_reading'].plot(ax=axs[i], label='test')
    predictions[building].plot(ax=axs[i], label='predictions')
    axs[i].set_title(f"Building {building}", fontsize=10)
    axs[i].set_xlabel("")
    axs[i].legend()

fig.tight_layout()
plt.show();

Feature selection

Feature selection is the process of choosing a subset of relevant features (variables) for use in model construction. Feature selection techniques are used for several reasons: to simplify models and make them easier to interpret, to shorten training times, to avoid the curse of dimensionality, and to improve generalization by reducing overfitting (formally, by reducing variance), among others.

Skforecast is compatible with the feature selection methods implemented in scikit-learn. There are several feature selection methods, but the most common are:

  • Recursive feature elimination (RFE)

  • Sequential Feature Selection (SFS)

  • Feature selection based on threshold (SelectFromModel)
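
As a rough illustration of how these three scikit-learn selectors behave, the sketch below applies them to a small synthetic regression problem in which only the first two of five features carry signal. The data, the linear estimator and the number of features to keep are assumptions chosen for the example; they are unrelated to the building dataset used in this document.

```python
import numpy as np
from sklearn.feature_selection import RFE, SelectFromModel, SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption): y depends only on features 0 and 1
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

estimator = LinearRegression()

selectors = {
    # Repeatedly fits the model and drops the least important feature
    "RFE": RFE(estimator, n_features_to_select=2, step=1),
    # Greedily adds the feature that most improves the cross-validated score
    "SFS": SequentialFeatureSelector(
        estimator, n_features_to_select=2, direction="forward", cv=3
    ),
    # Keeps features whose importance (|coef|) exceeds the mean importance
    "SelectFromModel": SelectFromModel(estimator, threshold="mean"),
}

selected = {}
for name, selector in selectors.items():
    selector.fit(X, y)
    selected[name] = np.flatnonzero(selector.get_support()).tolist()
    print(f"{name}: selected feature indices {selected[name]}")
```

On this toy problem all three approaches should recover the two informative features. On real data they can disagree, and RFE and SFS are considerably more expensive because they refit the estimator many times.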

💡 Tip

Feature selection is a powerful tool for improving the performance of machine learning models. However, it is computationally expensive and can be time-consuming. Since the goal is to find the best subset of features, not the best model, it is not necessary to use the entire dataset or a highly complex model. Instead, it is recommended to use a small subset of the data and a simple model. Once the best subset of features has been identified, the model can be trained on the entire dataset with a more complex configuration.

In [ ]:
# Feature selection
# ==============================================================================
regressor = LGBMRegressor(n_estimators=100, max_depth=5, random_state=15926, verbose=-1)
selector = RFECV(estimator=regressor, step=1, cv=3, n_jobs=1)
selected_lags, selected_window_features, selected_exog = select_features_multiseries(
    forecaster      = forecaster,
    selector        = selector,
    series          = {k: v.loc[:end_validation,] for k, v in series_dict.items()},
    exog            = {k: v.loc[:end_validation, exog_features] for k, v in exog_dict.items()},
    select_only     = None,
    force_inclusion = None,
    subsample       = 0.2,
    random_state    = 123,
    verbose         = True,
)
/home/ubuntu/anaconda3/envs/skforecast_14_py12/lib/python3.12/site-packages/skforecast/recursive/_forecaster_recursive_multiseries.py:1077: MissingValuesWarning: NaNs detected in `X_train`. Some regressors do not allow NaN values during training. If you want to drop them, set `forecaster.dropna_from_series = True`. 
 You can suppress this warning using: warnings.simplefilter('ignore', category=MissingValuesWarning)
  warnings.warn(
Recursive feature elimination (RFECV)
-------------------------------------
Total number of records available: 959424
Total number of records used for feature selection: 191884
Number of features available: 83
    Lags            (n=62)
    Window features (n=3)
    Exog            (n=18)
Number of features selected: 56
    Lags            (n=39) : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 27, 28, 30, 33, 35, 42, 43, 44, 47, 49, 50, 51, 54, 56, 60, 62]
    Window features (n=3) : ['roll_mean_7', 'roll_min_7', 'roll_max_7']
    Exog            (n=14) : ['sub_primaryspaceusage', 'timezone', 'sqm', 'airTemperature', 'cloudCoverage', 'dewTemperature', 'seaLvlPressure', 'windDirection', 'windSpeed', 'day_of_week_sin', 'day_of_week_cos', 'week_sin', 'week_cos', 'month_cos']

In [26]:
# Backtesting of the forecaster with the selected features
# ==============================================================================
forecaster = ForecasterRecursiveMultiSeries(
                regressor          = LGBMRegressor(**best_params, random_state=8520, verbose=-1),
                lags               = selected_lags,
                window_features    = window_features,
                transformer_series = None,
                transformer_exog   = transformer_exog,
                fit_kwargs         = {'categorical_feature': categorical_features},
                encoding           = "ordinal"
            )
cv = TimeSeriesFold(
    initial_train_size = 608 + 60, # Training + validation
    steps              = 7,
    refit              = False
)
metrics, predictions = backtesting_forecaster_multiseries(
                          forecaster        = forecaster,
                          series            = series_dict,
                          exog              = {k: v[exog_features] for k, v in exog_dict.items()},
                          cv                = cv,
                          metric            = 'mean_absolute_error',
                          verbose           = False,
                          show_progress     = True,
                          suppress_warnings = True
                      )

display(predictions.head())
display(metrics)
Bear_assembly_Angel Bear_assembly_Beatrice Bear_assembly_Danial Bear_assembly_Diana Bear_assembly_Genia Bear_assembly_Harry Bear_assembly_Jose Bear_assembly_Roxy Bear_assembly_Ruby Bear_education_Alfredo ... Wolf_office_Emanuel Wolf_office_Haydee Wolf_office_Joana Wolf_office_Nadia Wolf_office_Rochelle Wolf_public_Norma Wolf_retail_Harriett Wolf_retail_Marcella Wolf_retail_Toshia Wolf_science_Alfreda
2017-10-30 9626.737931 986.987146 4115.730889 28.355736 7698.815526 522.436206 6587.544940 545.660487 1725.385404 54.189837 ... 335.216997 170.143318 377.500042 1664.675830 395.541280 3411.830330 983.863534 166.663061 1319.386789 1776.367451
2017-10-31 10305.685639 989.391878 4008.593510 46.501090 7500.576527 501.195886 6627.187744 518.106346 1617.947843 102.646870 ... 350.736801 170.632769 347.939104 1778.520292 401.715195 3866.938793 1160.708035 153.027731 1703.401200 1762.124748
2017-11-01 11580.318132 1010.341555 4406.009428 53.277630 8518.103332 471.589541 7556.909480 539.607718 1651.703194 141.689303 ... 378.593324 185.523419 363.312809 1787.603859 412.526721 4146.799801 1105.229493 163.327361 1850.158677 1827.878745
2017-11-02 10989.821026 1052.330060 4468.847894 61.388218 8772.167019 466.063640 7321.600103 532.769989 1669.995231 162.281039 ... 353.901135 163.600344 343.479467 1696.069030 387.834532 3890.578581 1168.790868 150.456515 1629.042965 1816.280414
2017-11-03 10813.034611 1064.989588 4216.845915 72.815829 8476.272829 464.470173 6967.792821 610.780145 1706.882855 173.805203 ... 371.698302 193.065103 361.759688 1651.279474 405.204180 3806.558155 1110.789100 188.218057 1549.532986 1772.846319

5 rows × 1578 columns

levels mean_absolute_error
0 Bear_assembly_Angel 1072.303860
1 Bear_assembly_Beatrice 215.359461
2 Bear_assembly_Danial 313.559787
3 Bear_assembly_Diana 45.597033
4 Bear_assembly_Genia 579.565743
... ... ...
1576 Wolf_retail_Toshia 478.331909
1577 Wolf_science_Alfreda 192.194401
1578 average 344.540955
1579 weighted_average 344.540955
1580 pooling 344.540955

1581 rows × 2 columns

The number of predictors has been reduced (the selector kept 56 of the 83 candidate features) at the cost of only a small increase in the average MAE, from 336.5 to 344.5. This simplifies the model and speeds up training.
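
The trade-off can be quantified directly from the figures reported above. The short calculation below uses the `average` MAE of the two backtests and the feature counts printed by the selector; the variable names are illustrative only.

```python
# Values copied from the outputs shown in this document
mae_all_features = 336.462095   # 'average' row, backtest before feature selection
mae_selected = 344.540955       # 'average' row, backtest after feature selection
n_features_available = 83       # reported by the selector
n_features_selected = 56        # reported by the selector

rel_mae_change = (mae_selected - mae_all_features) / mae_all_features
rel_feature_reduction = (n_features_available - n_features_selected) / n_features_available

print(f"Relative change in MAE: {rel_mae_change:+.1%}")            # +2.4%
print(f"Features discarded by the selector: {rel_feature_reduction:.1%}")  # 32.5%
```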

Time series clustering

The idea behind modeling multiple series at the same time is to capture the main patterns that govern them, thereby reducing the impact of the noise present in each individual series. This means that series that behave similarly can benefit from being modeled together. One way to identify potential groups of series is to perform a clustering study before modeling. If clear groups emerge from the clustering, it is appropriate to model each of them separately.

Clustering is an unsupervised analysis technique that groups a set of observations into clusters whose members are considered homogeneous, while observations in different clusters are considered heterogeneous. Algorithms for clustering time series can be divided into two groups: those that apply a transformation to create features before clustering (feature-based time series clustering) and those that work directly on the time series (elastic distance measures).

  • Feature-based time series clustering: features that describe the structural characteristics of each time series are extracted and then fed into a clustering algorithm. These features are obtained by applying statistical operations that best capture the underlying characteristics: trend, seasonality, periodicity, serial correlation, skewness, kurtosis, chaos, nonlinearity, and self-similarity.

  • Elastic distance measures: this approach works directly on the time series, adjusting or "realigning" the series when they are compared with one another. The best-known measure of this family is Dynamic Time Warping (DTW).
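
The two approaches can be sketched on synthetic data as follows. The first part extracts two simple features per series (linear trend slope and lag-1 autocorrelation, an arbitrary choice made for this example) and clusters them with k-means. The second part is a textbook dynamic-programming implementation of DTW with an absolute-difference cost. The function names and the toy series are hypothetical; in practice, a dedicated library would normally be used for both tasks.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def summary_features(y):
    """Illustrative feature set: trend slope and lag-1 autocorrelation."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope = np.polyfit(t, y, 1)[0]
    lag1_autocorr = np.corrcoef(y[:-1], y[1:])[0, 1]
    return [slope, lag1_autocorr]

def dtw_distance(a, b):
    """Classic O(n*m) Dynamic Time Warping distance (no window constraint)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# Feature-based clustering on synthetic series (assumption: 3 trending, 3 seasonal)
rng = np.random.default_rng(0)
t = np.arange(100)
trending = [0.5 * t + rng.normal(0, 1, 100) for _ in range(3)]
seasonal = [10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, 100) for _ in range(3)]
features = np.array([summary_features(s) for s in trending + seasonal])
features_scaled = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features_scaled)
print("Cluster labels:", labels)

# Elastic distance: the same pattern shifted by one time step
a = np.array([0, 0, 1, 2, 3, 0], dtype=float)
b = np.array([0, 1, 2, 3, 0, 0], dtype=float)
print("Point-wise absolute distance:", np.abs(a - b).sum())
print("DTW distance:", dtw_distance(a, b))
```

In the second part, the point-wise distance penalizes the one-step shift, whereas DTW realigns the two series and treats them as identical. A matrix of pairwise DTW distances can then be passed to any clustering algorithm that accepts precomputed distances.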

For a detailed example of how time series clustering can improve forecasting models, see Modelos de forecasting globales: Análisis comparativo de modelos de una y múltiples series.

Session information

In [ ]:
import session_info
session_info.show(html=False)
-----
feature_engine      1.8.1
lightgbm            4.5.0
matplotlib          3.9.2
numpy               2.0.2
optuna              3.6.1
pandas              2.2.3
session_info        1.0.0
skforecast          0.14.0
sklearn             1.5.1
-----
IPython             8.27.0
jupyter_client      8.6.3
jupyter_core        5.7.2
notebook            6.4.12
-----
Python 3.12.5 | packaged by Anaconda, Inc. | (main, Sep 12 2024, 18:27:27) [GCC 11.2.0]
Linux-5.15.0-1071-aws-x86_64-with-glibc2.31
-----
Session information updated at 2024-11-06 16:12

Citation

How to cite this document

If you use this document or any part of it, please acknowledge the source, thank you!

Forecasting escalable: modelado de mil series temporales con un único modelo global by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) license at https://www.cienciadedatos.net/documentos/py59-modelos-forecasting-escalables.html

How to cite skforecast?

If you use skforecast in your research or in a publication, we would greatly appreciate a citation. Thank you very much!

Zenodo:

Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.14.0). Zenodo. https://doi.org/10.5281/zenodo.8382788

APA:

Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.14.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788

BibTeX:

@software{skforecast, author = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier}, title = {skforecast}, version = {0.14.0}, month = {11}, year = {2024}, license = {BSD-3-Clause}, url = {https://skforecast.org/}, doi = {10.5281/zenodo.8382788} }



Creative Commons Licence
This document, created by Joaquín Amat Rodrigo and Javier Escobar Ortiz, is licensed under Attribution-NonCommercial-ShareAlike 4.0 International.

You are free to:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NonCommercial: You may not use the material for commercial purposes.

  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.