  • Introduction
    • ARAR vs. ARIMA
  • ARAR model theory
    • Memory Shortening
    • Subset Autoregressive Model:
    • Forecasting
  • Libraries and data
  • ARAR
  • ForecasterStats
  • Prediction
  • Backtesting
  • Session information
  • Citation




Introduction

The ARAR algorithm is a specialized time-series forecasting method designed to handle data with long-memory components or complex trends. Unlike traditional models that rely on rigid structures to stationarize data, ARAR employs a dynamic, two-step process: transforming the data through "memory shortening" and then modeling the remaining signal. This approach allows ARAR to be highly automated, making it a powerful alternative for scenarios where manual model selection is impractical.

The ARAR algorithm is defined by its unique workflow, which separates the data cleaning process from the prediction process.

Phase 1: The Memory Shortening Process

Before any forecasting occurs, ARAR applies a series of adaptive filters to the raw data.

  • Goal: To decorrelate the data and remove long-term dependencies (trends and seasonality) that confuse standard models.
  • Method: It uses "memory-shortening" filters. Rather than just looking at the immediately preceding time step, these filters can subtract a fraction of a value from a longer lag (e.g., a week or a year ago).
  • Result: Transformed data that is "short-memory," making it easier to predict using simple autoregressive rules.

Phase 2: Fitting the Model

Once the data has been transformed, the algorithm fits a standard model to the residuals.

  • Model Type: It utilizes an AR (Autoregressive) model.
  • Simplicity: Because Phase 1 does the heavy lifting of removing complex patterns, Phase 2 only needs to look at past values (an "all-pole" model) rather than modeling random shocks (Moving Averages).

ARAR vs. ARIMA

While both methods aim to predict future values based on history, they approach the problem from fundamentally different angles.

| Feature | ARIMA (Auto-Regressive Integrated Moving Average) | ARAR (Memory Shortening + AR) |
|---|---|---|
| Transformation | Fixed differencing. It removes trends by subtracting the value at t-1 from the value at t (the "Integration" step). | Adaptive filters. It uses memory-shortening filters to subtract fractions of values from various time lags. |
| Model form | ARMA. It fits a combination of Autoregressive (AR) and Moving Average (MA) terms. | AR only. It fits a pure Autoregressive model to the transformed data. |
| Automation | Semi-manual. Often requires human judgment to select the orders (p, d, q), or complex selection criteria. | Fully automated. Designed to run with minimal human intervention, making it well suited to bulk processing. |
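
The difference between the two transformations can be illustrated with a minimal NumPy sketch. The function names and the values of `phi` and `tau` are hypothetical and chosen only for illustration; in ARAR both are estimated from the data.

```python
import numpy as np

def first_difference(y: np.ndarray) -> np.ndarray:
    """ARIMA-style integration step (d=1): y_t - y_{t-1}."""
    return y[1:] - y[:-1]

def memory_shortening_filter(y: np.ndarray, phi: float, tau: int) -> np.ndarray:
    """ARAR-style filter: y_t - phi * y_{t-tau}."""
    return y[tau:] - phi * y[:-tau]

# Toy series (illustrative values, not taken from the fuel consumption data)
y_demo = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0])

diffed = first_difference(y_demo)
shortened = memory_shortening_filter(y_demo, phi=0.9, tau=2)
```

With `phi=1` and `tau=1` the filter reduces to ordinary first differencing, so differencing can be read as one special case of the filter family.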

ARAR model theory

The ARAR model first applies a memory-shortening transformation if the underlying process of a given time series $\{Y_t,\ t = 1, 2, \dots, n\}$ is "long-memory", and then fits an autoregressive model to the transformed series.

Memory Shortening

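The algorithm follows five steps to classify $\{Y_t\}$ and take one of the following three actions:

  • L: declare $\{Y_t\}$ as long-memory and form $\{\tilde{Y}_t\}$ by $\tilde{Y}_t = Y_t - \hat{\phi}\, Y_{t-\hat{\tau}}$
  • M: declare $\{Y_t\}$ as moderately long-memory and form $\{\tilde{Y}_t\}$ by $\tilde{Y}_t = Y_t - \hat{\phi}_1 Y_{t-1} - \hat{\phi}_2 Y_{t-2}$
  • S: declare $\{Y_t\}$ as short-memory.

If $\{Y_t\}$ is declared to be L or M, the transformed series $\{\tilde{Y}_t\}$ is classified again. The process continues until the transformed series is classified as short-memory. The number of transformations is capped at three; in practice, a time series rarely requires more than two.

    1. For each $\tau = 1, 2, \dots, 15$, find the value $\hat{\phi}(\tau)$ of $\phi$ that minimizes $ERR(\phi, \tau) = \dfrac{\sum_{t=\tau+1}^{n} [Y_t - \phi Y_{t-\tau}]^2}{\sum_{t=\tau+1}^{n} Y_t^2}$, then define $Err(\tau) = ERR(\hat{\phi}(\tau), \tau)$ and choose the lag $\hat{\tau}$ to be the value of $\tau$ that minimizes $Err(\tau)$.
    2. If $Err(\hat{\tau}) \le 8/n$, $\{Y_t\}$ is a long-memory series (action L).
    3. If $\hat{\phi}(\hat{\tau}) \ge 0.93$ and $\hat{\tau} > 2$, $\{Y_t\}$ is a long-memory series (action L).
    4. If $\hat{\phi}(\hat{\tau}) \ge 0.93$ and $\hat{\tau} = 1$ or $2$, $\{Y_t\}$ is a moderately long-memory series (action M), with $\hat{\phi}_1$ and $\hat{\phi}_2$ chosen to minimize $\sum_{t=3}^{n} [Y_t - \phi_1 Y_{t-1} - \phi_2 Y_{t-2}]^2$.
    5. If $\hat{\phi}(\hat{\tau}) < 0.93$, $\{Y_t\}$ is a short-memory series (action S).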
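
The five steps can be sketched in NumPy as follows. This is an illustrative reading of the procedure described here, not the skforecast implementation; the function names `best_lag`, `classify` and `shorten_memory` are hypothetical, and the demo series are synthetic.

```python
import numpy as np

def best_lag(y: np.ndarray, max_lag: int = 15) -> tuple[int, float, float]:
    """Step 1: return (tau_hat, phi_hat, Err(tau_hat))."""
    best = None
    for tau in range(1, min(max_lag, len(y) - 1) + 1):
        lead, lagged = y[tau:], y[:-tau]
        # Least-squares phi for this lag (the denominator of ERR does not depend on phi)
        phi = np.dot(lead, lagged) / np.dot(lagged, lagged)
        err = np.sum((lead - phi * lagged) ** 2) / np.sum(lead ** 2)
        if best is None or err < best[2]:
            best = (tau, float(phi), float(err))
    return best

def classify(y: np.ndarray, max_lag: int = 15) -> tuple[str, int, float]:
    """Steps 2-5: label the series as 'L', 'M' or 'S'."""
    tau, phi, err = best_lag(y, max_lag)
    if err <= 8 / len(y) or (phi >= 0.93 and tau > 2):
        return "L", tau, phi
    if phi >= 0.93:  # tau is 1 or 2 here
        return "M", tau, phi
    return "S", tau, phi

def shorten_memory(y, max_rounds: int = 3) -> np.ndarray:
    """Apply the L / M transformations until the series is labelled 'S'."""
    s = np.asarray(y, dtype=float)
    for _ in range(max_rounds):
        label, tau, phi = classify(s)
        if label == "S":
            break
        if label == "L":
            s = s[tau:] - phi * s[:-tau]
        else:
            # Action M: least-squares fit on lags 1 and 2
            design = np.column_stack([s[1:-1], s[:-2]])
            phi1, phi2 = np.linalg.lstsq(design, s[2:], rcond=None)[0]
            s = s[2:] - phi1 * s[1:-1] - phi2 * s[:-2]
    return s

# Synthetic examples: a trending series and pure white noise
rng = np.random.default_rng(0)
trend = 100.0 + np.arange(200) + rng.normal(size=200)
noise = rng.normal(size=500)

label_trend = classify(trend)[0]
label_noise = classify(noise)[0]
shortened = shorten_memory(trend)
```

Each transformation drops the first few observations (as many as the lag used), which is why the memory-shortened series starts at index $k+1$ rather than 1.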

Subset Autoregressive Model:

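Next, the ARAR algorithm fits an autoregressive process to the mean-corrected series $X_t = S_t - \bar{S}$, $t = k+1, \dots, n$, where $\{S_t,\ t = k+1, \dots, n\}$ is the memory-shortened version of $\{Y_t\}$ obtained from the five steps described above and $\bar{S}$ is the sample mean of $S_{k+1}, \dots, S_n$.

The fitted model has the following form:

$$X_t = \phi_1 X_{t-1} + \phi_{l_1} X_{t-l_1} + \phi_{l_2} X_{t-l_2} + \phi_{l_3} X_{t-l_3} + Z_t$$

where $\{Z_t\} \sim WN(0, \sigma^2)$. For given lags $l_1$, $l_2$ and $l_3$, the coefficients $\phi_j$ and the white-noise variance $\sigma^2$ are obtained from the Yule-Walker equations:

$$
\begin{bmatrix}
1 & \hat{\rho}(l_1 - 1) & \hat{\rho}(l_2 - 1) & \hat{\rho}(l_3 - 1) \\
\hat{\rho}(l_1 - 1) & 1 & \hat{\rho}(l_2 - l_1) & \hat{\rho}(l_3 - l_1) \\
\hat{\rho}(l_2 - 1) & \hat{\rho}(l_2 - l_1) & 1 & \hat{\rho}(l_3 - l_2) \\
\hat{\rho}(l_3 - 1) & \hat{\rho}(l_3 - l_1) & \hat{\rho}(l_3 - l_2) & 1
\end{bmatrix}
\begin{bmatrix}
\phi_1 \\ \phi_{l_1} \\ \phi_{l_2} \\ \phi_{l_3}
\end{bmatrix}
=
\begin{bmatrix}
\hat{\rho}(1) \\ \hat{\rho}(l_1) \\ \hat{\rho}(l_2) \\ \hat{\rho}(l_3)
\end{bmatrix}
\tag{1}
$$

and $\sigma^2 = \hat{\gamma}(0)\left[1 - \phi_1 \hat{\rho}(1) - \phi_{l_1} \hat{\rho}(l_1) - \phi_{l_2} \hat{\rho}(l_2) - \phi_{l_3} \hat{\rho}(l_3)\right]$, where $\hat{\gamma}(j)$ and $\hat{\rho}(j)$, $j = 0, 1, 2, \dots$, are the sample autocovariances and autocorrelations of the series $\{X_t\}$.

The algorithm computes the coefficients $\phi_j$ for each set of lags with $1 < l_1 < l_2 < l_3 \le m$, where $m$ is chosen to be 13 or 26, and selects the model for which the Yule-Walker estimate of $\sigma^2$ is minimal.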
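
A minimal sketch of this exhaustive search, assuming the sample autocovariances use the usual divide-by-n estimator. The helper names `sample_acf` and `fit_subset_ar` are hypothetical, and the demo data is a simulated AR(1) series rather than the output of the memory-shortening step.

```python
from itertools import combinations
import numpy as np

def sample_acf(x, max_lag: int):
    """Return gamma_hat(0) and rho_hat(0..max_lag) of the mean-corrected series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    gamma = np.array([np.dot(x[: n - h], x[h:]) / n for h in range(max_lag + 1)])
    return gamma[0], gamma / gamma[0]

def fit_subset_ar(x, m: int = 13):
    """Search all lags 1 < l1 < l2 < l3 <= m and keep the smallest sigma^2."""
    gamma0, rho = sample_acf(x, m)
    best = None
    for l1, l2, l3 in combinations(range(2, m + 1), 3):
        lags = (1, l1, l2, l3)
        idx = np.array(lags)
        # Yule-Walker system: entry (i, j) is rho_hat(|lag_i - lag_j|)
        R = rho[np.abs(idx[:, None] - idx[None, :])]
        r = rho[idx]
        phi = np.linalg.solve(R, r)
        sigma2 = gamma0 * (1.0 - phi @ r)
        if best is None or sigma2 < best[2]:
            best = (lags, phi, float(sigma2))
    return best

# Simulated AR(1) series with coefficient 0.5 and unit-variance noise
rng = np.random.default_rng(1)
z = rng.normal(size=2000)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.5 * x[t - 1] + z[t]

lags, phi, sigma2 = fit_subset_ar(x, m=13)
```

For `m=13` the search solves 220 four-by-four linear systems, and 2300 for `m=26`, so the exhaustive approach stays cheap.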

Forecasting

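If a memory-shortening filter was found in the first step, it has coefficients $\Psi_0, \Psi_1, \dots, \Psi_k$ ($k \ge 0$), where $\Psi_0 = 1$. In this case the transformed series can be expressed as

$$S_t = \Psi(B) Y_t = Y_t + \Psi_1 Y_{t-1} + \dots + \Psi_k Y_{t-k}, \tag{2}$$

where $\Psi(B) = 1 + \Psi_1 B + \dots + \Psi_k B^k$ is a polynomial in the back-shift operator $B$.

If the subset autoregression found in the second step has coefficients $\phi_1$, $\phi_{l_1}$, $\phi_{l_2}$ and $\phi_{l_3}$, then the subset AR model for $X_t = S_t - \bar{S}$ is

$$\phi(B) X_t = Z_t, \tag{3}$$

where $\{Z_t\}$ is a white-noise series with zero mean and constant variance and $\phi(B) = 1 - \phi_1 B - \phi_{l_1} B^{l_1} - \phi_{l_2} B^{l_2} - \phi_{l_3} B^{l_3}$. From equations (2) and (3) one can obtain

$$\xi(B) Y_t = \phi(1) \bar{S} + Z_t, \tag{4}$$

where $\xi(B) = \Psi(B) \phi(B)$.

Assuming the fitted model in equation (3) is appropriate and that $Z_t$ is uncorrelated with $\{Y_j,\ j < t\}$ for every $t$, the minimum mean squared error linear predictors $P_n Y_{n+h}$ of $Y_{n+h}$ in terms of $\{1, Y_1, \dots, Y_n\}$, for $n > k + l_3$, can be determined from the recursions

$$P_n Y_{n+h} = -\sum_{j=1}^{k+l_3} \xi_j P_n Y_{n+h-j} + \phi(1) \bar{S}, \quad h \ge 1, \tag{5}$$

where $\xi_j$ is the coefficient of $B^j$ in $\xi(B)$, with the initial conditions $P_n Y_{n+h} = Y_{n+h}$ for $h \le 0$.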
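
The forecasting recursion can be sketched in a few lines of NumPy: the two polynomials are multiplied by convolving their coefficient vectors, and forecasts are fed back into the recursion as they are produced. The function `arar_forecast` and its arguments are hypothetical names used for illustration, not part of any library API.

```python
import numpy as np

def arar_forecast(y, psi, lags, phi, s_bar, h):
    """Iterate P_n Y_{n+h} = -sum_j xi_j P_n Y_{n+h-j} + phi(1) * S_bar.

    psi   : memory-shortening filter coefficients [1, psi_1, ..., psi_k]
    lags  : lags of the subset AR model, e.g. (1, l1, l2, l3)
    phi   : AR coefficients, in the same order as `lags`
    s_bar : sample mean of the memory-shortened series
    """
    phi_poly = np.zeros(max(lags) + 1)
    phi_poly[0] = 1.0
    for lag, coef in zip(lags, phi):
        phi_poly[lag] -= coef                  # phi(B) = 1 - sum_l phi_l B^l
    xi = np.convolve(psi, phi_poly)            # xi(B) = Psi(B) * phi(B)
    const = phi_poly.sum() * s_bar             # phi(1) * S_bar
    path = [float(v) for v in y]
    if len(path) <= len(xi) - 1:
        raise ValueError("the recursion needs n > k + l3 observations")
    for _ in range(h):
        past = path[-1 : -len(xi) : -1]        # most recent value first
        path.append(const - float(np.dot(xi[1:], past)))
    return np.array(path[-h:])

# Two hand-checkable toy cases (illustrative values only)
# AR(1) around a mean of 10: Y_{n+1} = 5 + 0.5 * Y_n
ar1 = arar_forecast([8.0, 14.0], psi=[1.0], lags=(1,), phi=(0.5,), s_bar=10.0, h=3)
# Differencing filter with mean increment 2: Y_{n+1} = Y_n + 2
drift = arar_forecast([1.0, 3.0, 5.0], psi=[1.0, -1.0], lags=(1,), phi=(0.0,), s_bar=2.0, h=2)
```

In the first case the forecasts decay towards the mean (12, 11, 10.5); in the second, the differencing filter makes them continue the linear trend (7, 9).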

Ref: Brockwell, Peter J, and Richard A. Davis. Introduction to Time Series and Forecasting. Springer (2016)

ℹ️ Note

The Python implementation of the ARAR algorithm in skforecast is based on the Julia package Durbyn.jl, developed by Resul Akay.

Libraries and data

# Libraries
# ==============================================================================
import matplotlib.pyplot as plt
from skforecast.stats import Arar
from skforecast.recursive import ForecasterStats
from skforecast.model_selection import TimeSeriesFold, backtesting_stats
from skforecast.datasets import fetch_dataset
from skforecast.plot import set_dark_theme
# Download data
# ==============================================================================
data = fetch_dataset(name='fuel_consumption', raw=False)
data = data.loc[:'1990-01-01 00:00:00']
y = data['Gasolinas'].rename('y').rename_axis('date')
y
╭──────────────────────────────── fuel_consumption ────────────────────────────────╮
│ Description:                                                                     │
│ Monthly fuel consumption in Spain from 1969-01-01 to 2022-08-01.                 │
│                                                                                  │
│ Source:                                                                          │
│ Obtained from Corporación de Reservas Estratégicas de Productos Petrolíferos and │
│ Corporación de Derecho Público tutelada por el Ministerio para la Transición     │
│ Ecológica y el Reto Demográfico. https://www.cores.es/es/estadisticas            │
│                                                                                  │
│ URL:                                                                             │
│ https://raw.githubusercontent.com/skforecast/skforecast-                         │
│ datasets/main/data/consumos-combustibles-mensual.csv                             │
│                                                                                  │
│ Shape: 644 rows x 5 columns                                                      │
╰──────────────────────────────────────────────────────────────────────────────────╯
date
1969-01-01    166875.2129
1969-02-01    155466.8105
1969-03-01    184983.6699
1969-04-01    202319.8164
1969-05-01    206259.1523
                 ...     
1989-09-01    687649.2852
1989-10-01    669889.1602
1989-11-01    601413.8867
1989-12-01    663568.1055
1990-01-01    610241.2461
Freq: MS, Name: y, Length: 253, dtype: float64

ARAR

Skforecast provides the class Arar to facilitate the implementation of ARAR models in Python, allowing users to easily fit and forecast time series data using this approach.

# ARAR model
# ==============================================================================
model = Arar()
model.fit(y)
Arar(max_ar_depth=26, max_lag=40)

Once the model is fitted, future observations can be forecasted using the predict and predict_interval methods.

# Prediction
# ==============================================================================
model.predict(steps=10)
array([576270.29065713, 711350.90294941, 645064.14251878, 699974.70526107,
       693641.4876215 , 813391.3131971 , 849840.34223407, 728834.11322404,
       698899.25161967, 640834.1450568 ])
# Prediction interval
# ==============================================================================
model.predict_interval(steps=10, level=[95])
mean lower_95 upper_95
step
1 576270.290657 535285.283204 617255.298110
2 711350.902949 669971.979006 752729.826893
3 645064.142519 597182.272797 692946.012240
4 699974.705261 651641.922678 748307.487844
5 693641.487621 643148.019307 744134.955936
6 813391.313197 762568.607694 864214.018700
7 849840.342234 798207.532180 901473.152288
8 728834.113224 677001.238600 780666.987848
9 698899.251620 646741.565713 751056.937527
10 640834.145057 588566.126391 693102.163723
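
The printed intervals can be inspected with a few lines of NumPy. The sketch below copies three rows of the table above (steps 1, 5 and 10) and checks that the bounds are symmetric around the mean and widen with the horizon. The implied standard error assumes the interval is a Gaussian one, which is an assumption and is not stated in the output.

```python
import numpy as np

# Rows for steps 1, 5 and 10, copied from the table above
mean = np.array([576270.290657, 693641.487621, 640834.145057])
lower = np.array([535285.283204, 643148.019307, 588566.126391])
upper = np.array([617255.298110, 744134.955936, 693102.163723])

half_width = (upper - lower) / 2
is_symmetric = np.allclose(mean - lower, upper - mean)

# Assumption: Gaussian interval, so half-width = z * standard error
z_975 = 1.959964  # 97.5% quantile of the standard normal distribution
implied_se = half_width / z_975
```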

ForecasterStats

The previous section showed how to build an ARAR model. To use it with the rest of skforecast's functionality (for example, the backtesting shown later in this document), the next step is to wrap the Arar model in a ForecasterStats object.

# Create and fit ForecasterStats
# ==============================================================================
forecaster = ForecasterStats(estimator=Arar())
forecaster.fit(y=y)
forecaster

ForecasterStats

General Information
  • Estimator: Arar
  • Window size: 1
  • Series name: y
  • Exogenous included: False
  • Creation date: 2025-11-28 02:18:22
  • Last fit date: 2025-11-28 02:18:22
  • Skforecast version: 0.19.0
  • Python version: 3.13.9
  • Forecaster id: None
Exogenous Variables
    None
Data Transformations
  • Transformer for y: None
  • Transformer for exog: None
Training Information
  • Training range: [Timestamp('1969-01-01 00:00:00'), Timestamp('1990-01-01 00:00:00')]
  • Training index type: DatetimeIndex
  • Training index frequency: MS
Estimator Parameters
    {'max_ar_depth': 26, 'max_lag': 40, 'safe': True}
Fit Kwargs
    {}


# Feature importances
# ==============================================================================
forecaster.get_feature_importances()
feature importance
0 lag_2 0.568527
1 lag_14 0.318155
2 lag_1 0.138978
3 lag_12 -0.351038
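
Assuming the values reported by get_feature_importances() are the fitted subset-AR coefficients (an assumption; the output does not say so explicitly), the selected lags would be $l_1 = 2$, $l_2 = 12$ and $l_3 = 14$, all within max_ar_depth = 26, and the model for the mean-corrected, memory-shortened series would read, with coefficients rounded to three decimals:

$$X_t = 0.139\, X_{t-1} + 0.569\, X_{t-2} - 0.351\, X_{t-12} + 0.318\, X_{t-14} + Z_t$$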

Prediction

# Predict
# ==============================================================================
predictions = forecaster.predict(steps=10)
predictions.head(3)
1990-02-01    576270.290657
1990-03-01    711350.902949
1990-04-01    645064.142519
Freq: MS, Name: pred, dtype: float64
# Predict intervals
# ==============================================================================
predictions = forecaster.predict_interval(steps=36, alpha=0.05)
predictions.head(3)
pred lower_bound upper_bound
1990-02-01 576270.290657 535285.283204 617255.298110
1990-03-01 711350.902949 669971.979006 752729.826893
1990-04-01 645064.142519 597182.272797 692946.012240
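
The interval is specified in three different ways in this document: `level=[95]` in Arar.predict_interval, `alpha=0.05` in ForecasterStats.predict_interval, and `interval=[2.5, 97.5]` in backtesting. The first two produce identical bounds in the outputs shown here. The helpers below are hypothetical and not part of skforecast; they only spell out the arithmetic that links the three conventions, under the assumption that all of them describe the same nominal coverage.

```python
def alpha_to_level(alpha: float) -> float:
    """Significance level -> nominal coverage in percent (0.05 -> 95)."""
    return 100 * (1 - alpha)

def alpha_to_percentiles(alpha: float) -> list[float]:
    """Significance level -> lower and upper percentiles (0.05 -> [2.5, 97.5])."""
    return [100 * alpha / 2, 100 * (1 - alpha / 2)]

level = alpha_to_level(0.05)
percentiles = alpha_to_percentiles(0.05)
```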

Backtesting

ARAR and other statistical models, once integrated in a ForecasterStats object, can be evaluated using any of the backtesting strategies implemented in skforecast.

# Backtesting
# ==============================================================================
cv = TimeSeriesFold(
    initial_train_size = 150,
    steps              = 12,
    refit              = True,
)

metric, predictions = backtesting_stats(
    y               = y,
    forecaster      = forecaster,
    cv              = cv,
    interval        = [2.5, 97.5],
    metric          = 'mean_absolute_error',
    verbose         = False
)
# Backtest predictions
# ==============================================================================
predictions.head(4)
fold pred lower_bound upper_bound
1981-07-01 0 585006.456464 548872.543529 621140.369400
1981-08-01 0 632872.256680 596247.977571 669496.535788
1981-09-01 0 515431.057548 474418.134356 556443.980739
1981-10-01 0 523423.286271 481982.529292 564864.043250
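
The fold layout implied by these settings can be checked by hand. The sketch below uses only pandas and the numbers given above (253 observations, initial_train_size=150, steps=12). It assumes there is no gap between training and test sets and that an incomplete last fold is kept; both are assumptions about the configuration, not statements taken from the output.

```python
import math
import pandas as pd

# Values taken from the data and the TimeSeriesFold call above
n_obs, initial_train_size, steps = 253, 150, 12
index = pd.date_range("1969-01-01", periods=n_obs, freq="MS")

n_folds = math.ceil((n_obs - initial_train_size) / steps)
fold_starts = [index[initial_train_size + i * steps] for i in range(n_folds)]
last_fold_size = n_obs - initial_train_size - (n_folds - 1) * steps
```

The first fold starts at 1981-07-01, which matches the first row of the backtest predictions shown above.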
# Plot predictions
# ==============================================================================
set_dark_theme()
fig, ax = plt.subplots(figsize=(7, 4))
y.loc[predictions.index].plot(ax=ax, label='y')
predictions['pred'].plot(ax=ax, label='predictions')
ax.fill_between(
        predictions.index,
        predictions['lower_bound'],
        predictions['upper_bound'],
        label='prediction interval',
        color='gray',
        alpha=0.6,
        zorder=1
    )
plt.legend()
plt.show()
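
Besides the point-forecast metric, it can be useful to check how often the observed values fall inside the predicted intervals. The helper `interval_coverage` below is a hypothetical utility, not part of skforecast. It assumes a DataFrame with `lower_bound` and `upper_bound` columns, as in the backtest output, and is demonstrated on toy data.

```python
import pandas as pd

def interval_coverage(y_true: pd.Series, predictions: pd.DataFrame) -> float:
    """Fraction of observations inside [lower_bound, upper_bound]."""
    y_true = y_true.loc[predictions.index]
    inside = (y_true >= predictions["lower_bound"]) & (y_true <= predictions["upper_bound"])
    return float(inside.mean())

# Toy data (illustrative values only)
idx = pd.date_range("2000-01-01", periods=4, freq="MS")
y_toy = pd.Series([10.0, 12.0, 9.0, 15.0], index=idx)
pred_toy = pd.DataFrame(
    {
        "pred": [11.0, 11.0, 11.0, 11.0],
        "lower_bound": [9.0, 9.0, 9.5, 9.0],
        "upper_bound": [13.0, 13.0, 13.0, 13.0],
    },
    index=idx,
)

coverage = interval_coverage(y_toy, pred_toy)
```

With the objects created in this document, the equivalent call would be `interval_coverage(y, predictions)`; a value close to 0.95 would indicate well-calibrated 95% intervals.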

Session information

import session_info
session_info.show(html=False)
-----
matplotlib          3.10.8
pandas              2.3.3
session_info        v1.0.1
skforecast          0.19.0
-----
IPython             9.7.0
jupyter_client      8.6.3
jupyter_core        5.9.1
-----
Python 3.13.9 | packaged by conda-forge | (main, Oct 22 2025, 23:12:41) [MSC v.1944 64 bit (AMD64)]
Windows-11-10.0.26100-SP0
-----
Session information updated at 2025-11-28 02:18

Citation

How to cite this document

If you use this document or any part of it, please acknowledge the source, thank you!

ARAR forecasting models in Python by Joaquín Amat Rodrigo and Javier Escobar Ortiz, available under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) at https://cienciadedatos.net/documentos/py73-arar-forecasting-models-python.html

How to cite skforecast

If you use skforecast for a publication, we would appreciate it if you cite the published software.

Zenodo:

Amat Rodrigo, Joaquin, & Escobar Ortiz, Javier. (2024). skforecast (v0.19.0). Zenodo. https://doi.org/10.5281/zenodo.8382788

APA:

Amat Rodrigo, J., & Escobar Ortiz, J. (2024). skforecast (Version 0.19.0) [Computer software]. https://doi.org/10.5281/zenodo.8382788

BibTeX:

@software{skforecast, author = {Amat Rodrigo, Joaquin and Escobar Ortiz, Javier}, title = {skforecast}, version = {0.19.0}, month = {11}, year = {2025}, license = {BSD-3-Clause}, url = {https://skforecast.org/}, doi = {10.5281/zenodo.8382788} }



Creative Commons Licence

This work by Joaquín Amat Rodrigo, Javier Escobar Ortiz and Resul Akay is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International license.

Allowed:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NonCommercial: You may not use the material for commercial purposes.

  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.