Characteristic Root Analysis and Regularization for Linear Time Series Forecasting

Zheng Wang, Kaixuan Zhang, Wanfang Chen, Xiaonan Lu, Longyuan Li, Tobias Schlagenhauf

TL;DR

A systematic study of linear models for time series forecasting that focuses on the role of characteristic roots in temporal dynamics and proposes two complementary strategies for robust root restructuring.

Abstract

Time series forecasting remains a critical challenge across numerous domains, yet the effectiveness of complex models often varies unpredictably across datasets. Recent studies highlight the surprising competitiveness of simple linear models, suggesting that their robustness and interpretability warrant deeper theoretical investigation. This paper presents a systematic study of linear models for time series forecasting, with a focus on the role of characteristic roots in temporal dynamics. We begin by analyzing the noise-free setting, where we show that characteristic roots govern long-term behavior and explain how design choices such as instance normalization and channel independence affect model capabilities. We then extend our analysis to the noisy regime, revealing that models tend to produce spurious roots. This leads to the identification of a key data-scaling property: mitigating the influence of noise requires disproportionately large training data, highlighting the need for structural regularization. To address these challenges, we propose two complementary strategies for robust root restructuring. The first uses rank reduction techniques, including Reduced-Rank Regression and Direct Weight Rank Reduction, to recover the low-dimensional latent dynamics. The second, a novel adaptive method called Root Purge, encourages the model to learn a noise-suppressing null space during training. Extensive experiments on standard benchmarks demonstrate the effectiveness of both approaches, validating our theoretical insights and achieving state-of-the-art results in several settings. Our findings underscore the potential of integrating classical theories for linear systems with modern learning techniques to build robust, interpretable, and data-efficient forecasting models.
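To make the claim that characteristic roots govern long-term behavior concrete, here is a minimal sketch for the noise-free case. It assumes a scalar recurrence of the form $y_t = \sum_{i=1}^{p} a_i y_{t-i}$, which the abstract does not spell out. The helper names `characteristic_roots` and `simulate` and the example coefficients are illustrative and not taken from the paper.

```python
import numpy as np

def characteristic_roots(coeffs):
    """Roots of z^p - a_1 z^(p-1) - ... - a_p for y_t = sum_i a_i * y_{t-i}.

    The recurrence form is an assumption made for this sketch.
    """
    poly = np.concatenate(([1.0], -np.asarray(coeffs, dtype=float)))
    return np.roots(poly)

def simulate(coeffs, init, steps):
    """Roll the noise-free recurrence forward from the given initial values."""
    y = list(init)
    for _ in range(steps):
        # The most recent value y[-1] pairs with a_1, y[-2] with a_2, and so on.
        y.append(sum(a * y[-i - 1] for i, a in enumerate(coeffs)))
    return np.array(y)

# y_t = 1.5 y_{t-1} - 0.56 y_{t-2}: both roots lie inside the unit circle,
# so every trajectory decays toward zero.
decaying = [1.5, -0.56]

# y_t = 2 cos(w) y_{t-1} - y_{t-2}: the roots are exp(+/- i w) on the unit
# circle, so trajectories oscillate with period 2*pi/w (24 steps here).
omega = 2 * np.pi / 24
periodic = [2 * np.cos(omega), -1.0]

print("decaying root moduli:", np.abs(characteristic_roots(decaying)))
print("periodic root moduli:", np.abs(characteristic_roots(periodic)))
print("decaying tail:", simulate(decaying, [1.0, 1.0], 50)[-3:])
print("periodic tail:", simulate(periodic, [0.0, np.sin(omega)], 48)[-3:])
```

Under this assumed form, root moduli below one correspond to decay, moduli equal to one to sustained oscillation, and moduli above one to growth. A spurious root fitted from noise would therefore add a mode that the underlying signal does not contain.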

Paper Structure

This paper contains 91 sections, 16 theorems, 175 equations, 27 figures, 19 tables, 2 algorithms.

Key Result

Proposition 3

Let $\{ \mathbf{y}_t \in \mathbb{R}^m \}$ be a vector time series following a diagonal matrix recurrence of order $p$:

$$\mathbf{y}_t = \sum_{i=1}^{p} \mathbf{D}_i \mathbf{y}_{t-i},$$

where $\mathbf{D}_i = \text{diag}(d_i^{(1)}, \dots, d_i^{(m)})$ are diagonal matrices. Then there exists an equivalent order-$L$ recurrence (where $L \le mp$) that can be written as:

$$\mathbf{y}_t = \sum_{i=1}^{L} \mathbf{D}_i' \mathbf{y}_{t-i},$$

where each $\mathbf{D}_i' = k_i \mathbf{I}_m$ is a scaled identity matrix, with $k_i \in \mathb
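A small numerical sketch may help illustrate the statement. It assumes the recurrence has the form $\mathbf{y}_t = \sum_{i=1}^{p} \mathbf{D}_i \mathbf{y}_{t-i}$ and builds the shared scalar coefficients by multiplying the per-channel characteristic polynomials. This is one possible construction, not necessarily the one used in the paper's proof. The names `shared_recurrence` and `roll` are hypothetical.

```python
import numpy as np

def shared_recurrence(D):
    """Scalar coefficients k_1..k_L shared by all channels (L = m * p here).

    D has shape (p, m); D[i - 1, j] is the coefficient d_i^{(j)} of channel j.
    Each channel's characteristic polynomial divides the product of all of
    them, so every channel also satisfies the recurrence of the product.
    """
    p, m = D.shape
    product = np.array([1.0])
    for j in range(m):
        # z^p - d_1 z^(p-1) - ... - d_p for channel j
        channel_poly = np.concatenate(([1.0], -D[:, j]))
        product = np.polymul(product, channel_poly)
    return -product[1:]

def roll(D, init, steps):
    """Simulate the diagonal recurrence; init has shape (p, m), oldest first."""
    y = [row for row in init]
    p = D.shape[0]
    for _ in range(steps):
        # Diagonal matrices act channel-wise, so elementwise products suffice.
        y.append(sum(D[i] * y[-i - 1] for i in range(p)))
    return np.array(y)

# Two channels (m = 2), each following its own order-1 recurrence (p = 1).
D = np.array([[0.9, -0.5]])
k = shared_recurrence(D)          # coefficients of the shared order-2 recurrence
traj = roll(D, np.array([[1.0, 2.0]]), 20)

# Check that both channels obey y_t = k_1 y_{t-1} + k_2 y_{t-2}.
residual = traj[2:] - (k[0] * traj[1:-1] + k[1] * traj[:-2])
print("shared coefficients:", k)
print("max residual:", np.abs(residual).max())
```

The product construction always yields $L = mp$. When channels share characteristic roots, a lower-order shared recurrence exists, which is consistent with the bound $L \le mp$.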

Figures (27)

  • Figure 1: Structure of the paper and its main contributions.
  • Figure 2: Average forecasting MSE on ETTh1 and ETTm1 across horizons $H=\{96, 192, 336, 720\}$ for different values of $\lambda$. Results indicate that a wide range of $\lambda$ improves predictions, whereas larger values may cause over-regularization. A break-down table for each horizon is in Appendix \ref{appd:exp:full-hyperparameter}.
  • Figure 3: First 336 singular value magnitudes on ETTh1 and ETTm1 under different values of $\lambda$ (log scale). As $\lambda$ increases, Root Purge pushes the weight matrix $\mathbf{W}$ to have more small singular values, while the significant singular values remain largely unaffected.
  • Figure 4: Data scaling and noise robustness of state-of-the-art linear time-series models. (left) RRR and Root Purge exhibit near-constant performance in data-scaling benchmarks. (right) Both methods exhibit robust performance under increasing noise levels, outperforming baseline models.
  • Figure 5: A mind map of the paper showing its logical flow and organization.
  • ...and 22 more figures
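Since the captions above discuss the singular values of the weight matrix $\mathbf{W}$, a brief sketch of rank truncation by SVD may be useful. It uses plain truncated SVD on a synthetic matrix. Whether this matches the paper's Direct Weight Rank Reduction procedure in detail is an assumption, and the name `truncate_rank`, the matrix sizes, and the noise level are illustrative only.

```python
import numpy as np

def truncate_rank(W, r):
    """Best rank-r approximation of W in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the r largest singular values and their singular vectors.
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)

# Synthetic stand-in for a learned weight matrix: a rank-2 signal plus
# small dense noise, which adds many small singular values.
A = rng.normal(size=(8, 2))
B = rng.normal(size=(2, 6))
W_clean = A @ B
W_noisy = W_clean + 0.01 * rng.normal(size=W_clean.shape)

W_low = truncate_rank(W_noisy, 2)

print("singular values of W_noisy:", np.linalg.svd(W_noisy, compute_uv=False))
print("rank after truncation:", np.linalg.matrix_rank(W_low))
```

In this toy setting the two dominant singular values carry the signal, and the remaining ones come from the added noise. Truncation discards exactly those trailing directions.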

Theorems & Definitions (42)

  • Remark 1
  • Remark 2: On the Construction of $\mathbf{Y}_{\text{his}}$ and $\mathbf{Y}_{\text{fut}}$
  • Claim 1: Part I of Fact \ref{prop: horizon_lookback}
  • Claim 2: Part II of Fact \ref{prop: horizon_lookback}
  • Proposition 3: Equivalent Representation of Diagonal Recurrences
  • proof
  • Remark 3: Technical Condition for General Matrices
  • Proposition 4: Characteristic Roots Generalize up to Sequence Initial Conditions
  • proof
  • Corollary 1: Expressivity of Linear Recurrence Models
  • ...and 32 more