RI-Loss: A Learnable Residual-Informed Loss for Time Series Forecasting

Jieting Wang; Xiaolei Shang; Feijiang Li; Furong Peng

TL;DR

This work addresses the limitations of mean-squared-error (MSE) losses in time series forecasting by introducing RI-Loss, a kernel-based objective that uses the Hilbert-Schmidt Independence Criterion (HSIC) to enforce dependence between the model's residual sequence and a random time series. The authors derive a non-asymptotic HSIC bound using double-sample Rademacher complexities and Bernstein-type concentration inequalities, which gives generalization guarantees for optimizing the loss. Empirically, RI-Loss yields consistent improvements across eight real-world datasets and five backbone models (including Transformer and MLP architectures) while remaining competitive in runtime. The approach offers a principled, noise-aware framework for long-horizon forecasting, and the code is publicly available.

Abstract

Time series forecasting relies on predicting future values from historical data, yet most state-of-the-art approaches, including Transformer-based and multilayer perceptron (MLP)-based models, are optimized with Mean Squared Error (MSE). MSE has two fundamental weaknesses: its point-wise error computation fails to capture temporal relationships, and it does not account for inherent noise in the data. To overcome these limitations, we introduce the Residual-Informed Loss (RI-Loss), a novel objective function based on the Hilbert-Schmidt Independence Criterion (HSIC). RI-Loss explicitly models noise structure by enforcing dependence between the residual sequence and a random time series, enabling more robust, noise-aware representations. Theoretically, we derive the first non-asymptotic HSIC bound with explicit double-sample complexity terms, achieving optimal convergence rates through Bernstein-type concentration inequalities and Rademacher complexity analysis. This provides rigorous guarantees for RI-Loss optimization while precisely quantifying kernel space interactions. Empirically, experiments across eight real-world benchmarks and five leading forecasting models demonstrate improvements in predictive performance, validating the effectiveness of our approach. The code is publicly available at: https://github.com/shang-xl/RI-Loss.
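To make the HSIC quantity mentioned above concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, (1/n^2) tr(K H L H), computed between a residual sequence and a random series. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the 1/n^2 normalization, the function names, and the synthetic data are all assumptions chosen for illustration. How RI-Loss turns this statistic into a training objective is not specified in the abstract.

```python
import math
import random


def rbf_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a list of scalars (assumed kernel choice)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in x] for a in x]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]  # equal to column means by symmetry
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_biased(x, y, sigma=1.0):
    """Biased empirical HSIC: (1/n^2) * trace(K H L H)."""
    n = len(x)
    kc = center(rbf_gram(x, sigma))
    lc = center(rbf_gram(y, sigma))
    # trace(Kc @ Lc) equals the elementwise sum because both matrices are symmetric.
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / (n * n)


# Hypothetical data: a stand-in residual sequence and an independent random series.
random.seed(0)
n = 64
residual = [random.gauss(0.0, 1.0) for _ in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

print("HSIC(residual, residual):", hsic_biased(residual, residual))
print("HSIC(residual, noise):   ", hsic_biased(residual, noise))
```

A larger value indicates stronger statistical dependence between the two sequences under the chosen kernels, so a sequence compared with itself scores well above two independent draws. This pure-Python version costs O(n^2) time and memory; a practical training loss would compute the same quantity with differentiable tensor operations.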

Paper Structure

Figures (9)

Theorems & Definitions (22)