Adaptive Nonlinear Vector Autoregression: Robust Forecasting for Noisy Chaotic Time Series

Azimov Sherkhon, Susana Lopez-Moreno, Eric Dolores-Cuenca, Sieun Lee, Sangil Kim

TL;DR

Adaptive NVAR replaces fixed nonlinear feature bases with a trainable, shallow MLP that learns data-driven nonlinearities while the linear readout is optimized jointly. A skip connection and end-to-end gradient-based training give it robust forecasts of chaotic time series under noise and let it scale to high-dimensional systems without explicit large-matrix inversions. On Mackey–Glass and Lorenz–63 it outperforms standard NVAR, leaky ESN, and Hybrid ESN, most clearly in noisy regimes; on the high-dimensional Lorenz–96 system it was the only model benchmarked, because the baselines ran into memory or runtime limits. The approach supports GPU-accelerated training and points toward applications in geophysical and other complex time-series forecasting tasks.

Abstract

Nonlinear vector autoregression (NVAR) and reservoir computing (RC) have shown promise in forecasting chaotic dynamical systems, such as the Lorenz-63 model and the El Niño-Southern Oscillation. However, their reliance on fixed nonlinear transformations (polynomial expansions in NVAR, random feature maps in RC) limits their adaptability to high noise or complex real-world data. These methods also scale poorly in high-dimensional settings because of the costly matrix inversion required during optimization. We propose a data-adaptive NVAR model that combines delay-embedded linear inputs with features generated by a shallow, trainable multilayer perceptron (MLP). Unlike in standard NVAR and RC models, the MLP and linear readout are jointly trained using gradient-based optimization, enabling the model to learn data-driven nonlinearities while preserving a simple readout structure and improving scalability. In initial experiments across multiple chaotic systems, under both noise-free and synthetically noisy conditions, the adaptive model achieved higher predictive accuracy than the standard NVAR, a leaky echo state network (ESN, the most common RC model), and a hybrid ESN, indicating robust forecasting under noise.
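
The architecture described in the abstract can be illustrated with a minimal NumPy sketch: delay-embedded inputs feed a one-hidden-layer MLP, the raw inputs are concatenated with the MLP features (the skip connection), and a single linear readout is trained jointly with the MLP. Everything specific here is an assumption made for illustration and not taken from the paper: the function names, the `tanh` activation, the layer sizes, plain full-batch gradient descent with hand-written gradients, and the toy sine signal standing in for a chaotic series.

```python
import numpy as np

def delay_embed(x, k):
    """Row t holds k consecutive states [x_t, ..., x_{t+k-1}]; x has shape (T, d)."""
    n = x.shape[0] - k + 1
    return np.hstack([x[i:i + n] for i in range(k)])

def forward(params, Z):
    """Return predictions plus the intermediate features needed for gradients."""
    W1, b1, W_out = params
    H = np.tanh(Z @ W1 + b1)   # learned nonlinear features (assumed tanh MLP)
    F = np.hstack([Z, H])      # skip connection: linear inputs bypass the MLP
    return F @ W_out, H, F

def train(x, k=2, hidden=8, lr=0.01, steps=500, seed=0):
    """Jointly fit the MLP and the readout to one-step-ahead targets."""
    rng = np.random.default_rng(seed)
    Z = delay_embed(x, k)[:-1]  # inputs: windows that have a successor state
    Y = x[k:]                   # targets: the state right after each window
    D, d = Z.shape[1], x.shape[1]
    W1 = rng.normal(0.0, 0.5, (D, hidden))
    b1 = np.zeros(hidden)
    W_out = rng.normal(0.0, 0.1, (D + hidden, d))
    losses = []
    for _ in range(steps):
        Y_hat, H, F = forward((W1, b1, W_out), Z)
        E = Y_hat - Y
        losses.append(float(np.mean(E ** 2)))
        G = 2.0 * E / E.size                # dLoss/dY_hat for mean squared error
        dW_out = F.T @ G                    # gradient for the readout
        dH = (G @ W_out.T)[:, D:]           # backpropagate into the MLP features
        dA = dH * (1.0 - H ** 2)            # derivative of tanh
        W_out -= lr * dW_out                # one joint update of all parameters
        W1 -= lr * (Z.T @ dA)
        b1 -= lr * dA.sum(axis=0)
    return (W1, b1, W_out), losses

# Toy usage on a sine wave (a stand-in signal, not one of the paper's systems).
x = np.sin(0.1 * np.arange(400))[:, None]
params, losses = train(x)
pred, _, _ = forward(params, delay_embed(x, 2)[:-1])
```

In this sketch no matrix is inverted at any point: the readout `W_out` is updated by the same gradient steps as the MLP weights, which is the property the abstract credits for improved scalability. A practical implementation would more likely use an automatic-differentiation framework and a minibatch optimizer on a GPU.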

Paper Structure

This paper contains 14 sections, 23 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of the standard NVAR (left) and Adaptive NVAR (right). In the standard formulation, the nonlinear feature vector is built as a (quadratic) polynomial of the inputs, and the linear readout matrix $W_{\text{out}}$ is computed as the closed-form solution of a least-squares regression with Tikhonov regularization (ridge regression). In contrast, the adaptive model uses a trained MLP to generate $H_{\mathcal{NN}}$, and $W_{\text{out}}$ is a trainable weight matrix within the skip-connection architecture, learned by gradient descent together with the MLP.
  • Figure 2: Forecasting performance across dynamical systems and noise regimes. Root mean square error (RMSE, log scale) of all models evaluated on the Mackey–Glass, Lorenz–63, and Lorenz–96 systems for increasing forecast horizons (25–100 steps) under four noise conditions: noise-free, low (10%), moderate (20%), and high (30%). Bars and error caps denote the mean and standard deviation computed over multiple independent, non-overlapping forecast windows. For the high-dimensional Lorenz–96 system, only Adaptive NVAR was benchmarked, as the standard NVAR encountered a memory bottleneck and the ESN and HESN models faced prohibitive runtime constraints.
  • Figure 3: Forecasting performance on the Mackey–Glass system under high noise (30%). The top subplot shows the ground-truth signal (red, dotted), the noisy input (black), and the predictions of the four models (colored lines) across ten non-overlapping forecast windows. Vertical dotted lines mark the boundaries of these windows, each $100$ time steps long. The bottom subplot shows the window-wise RMSE(t), computed independently for each window.
  • Figure 4: Forecasting performance on the Lorenz–63 system under high noise (30%). The first three subplots display the true signal (red, dotted), the noisy input (black), and the predictions of the four models (colored lines) for each state variable. The bottom subplot shows the window-wise RMSE(t), computed independently for each non-overlapping forecast window.
  • Figure 5: Forecasting performance of the high-dimensional Lorenz–96 system (30% noise). The plot shows the true signal (red, dotted), noisy input (black), and Adaptive NVAR predictions (green) for the first five state variables and the first ten non-overlapping forecast windows (each of length $100$ time steps).
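
Two ingredients named in the captions above can be sketched in a few lines of NumPy: the standard NVAR baseline from Figure 1 (quadratic polynomial features with a closed-form ridge readout) and the window-wise RMSE used in Figures 2 to 5. The function names, the regularization strength `alpha`, and the choice to include a constant feature and to drop a trailing partial window are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np

def quadratic_features(Z):
    """Constant, linear, and unique pairwise-product terms for each row of Z."""
    i, j = np.triu_indices(Z.shape[1])  # index pairs with i <= j
    return np.hstack([np.ones((Z.shape[0], 1)), Z, Z[:, i] * Z[:, j]])

def ridge_readout(F, Y, alpha=1e-6):
    """Closed-form Tikhonov solution W = (F^T F + alpha I)^{-1} F^T Y."""
    A = F.T @ F + alpha * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ Y)  # solve the system instead of inverting A

def windowed_rmse(y_true, y_pred, window):
    """RMSE per non-overlapping window; a trailing partial window is dropped."""
    n = (y_true.shape[0] // window) * window
    err2 = (y_true[:n] - y_pred[:n]) ** 2
    return np.sqrt(err2.reshape(n // window, -1).mean(axis=1))

# Usage: recover a known readout from synthetic features, then score two windows.
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 3))
F = quadratic_features(Z)               # 1 + 3 + 6 = 10 columns
W_true = rng.normal(size=(F.shape[1], 2))
W_fit = ridge_readout(F, F @ W_true)

scores = windowed_rmse(np.zeros(10), np.array([1.0] * 5 + [2.0] * 5), window=5)
# scores -> array([1., 2.])
```

For `D` input columns the quadratic expansion adds `D * (D + 1) / 2` product terms, so the feature matrix and the `F.T @ F` system grow quickly with dimension. That growth is one plausible reading of the memory bottleneck reported for the standard NVAR on Lorenz–96, though the captions do not spell out the cause.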