
Position: Why a Dynamical Systems Perspective is Needed to Advance Time Series Modeling

Daniel Durstewitz, Christoph Jürgen Hemmer, Florian Hess, Charlotte Ricarda Doll, Lukas Eisenmann

Abstract

Time series (TS) modeling has come a long way from early statistical, mainly linear, approaches to the current trend in TS foundation models. With a lot of hype and industrial demand in this field, it is not always clear how much progress there really is. To advance TS forecasting and analysis to the next level, here we argue that the field needs a dynamical systems (DS) perspective. TS of observations from natural or engineered systems almost always originate from some underlying DS, and arguably access to its governing equations would yield theoretically optimal forecasts. This is the promise of DS reconstruction (DSR), a class of ML/AI approaches that aim to infer surrogate models of the underlying DS from data. But models based on DS principles offer other profound advantages: Beyond short-term forecasts, they enable prediction of the long-term statistics of an observed system, which in many practical scenarios may be the more relevant quantities. DS theory furthermore provides domain-independent theoretical insight into mechanisms underlying TS generation, and thereby will inform us, e.g., about upper bounds on performance of any TS model, generalization into unseen regimes as in tipping points, or potential control strategies. After reviewing some of the central concepts, methods, measures, and models in DS theory and DSR, we will discuss how insights from this field can advance TS modeling in crucial ways, enabling better forecasting with much lower computational and memory footprints. We conclude with a number of specific suggestions for translating insights from DSR into TS modeling.

Paper Structure (54 sections, 1 theorem, 46 equations, 24 figures, 3 tables)


Key Result

Theorem 1.2

Let $\Phi: E \to E$ be a $C^r$ ($r \geq 2$) flow on an open set $E \subseteq \mathbb{R}^M$. Let $A \subseteq E$ be a compact set invariant under $\Phi$ (i.e., $\Phi(A) \subseteq A$) with box-counting dimension $d_{\text{box}}$. Let $g: \mathbb{R}^M \to \mathbb{R}$ be a smooth measurement function [...] Therefore, $H$ restricted to $A$ is a diffeomorphism onto its image $H(A)$.
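
The statement above is cut off in this listing, so the definition of $H$ is not shown. Assuming $H$ denotes a delay-coordinate map built from scalar measurements $g$ (an assumption, in the style of delay-embedding theorems), the following minimal Python sketch shows how such a map turns a scalar time series into vectors. The function name `delay_embed` and the parameters `dim` and `lag` are hypothetical and do not come from the paper.

```python
def delay_embed(series, dim, lag):
    """Build delay vectors (x_t, x_{t-lag}, ..., x_{t-(dim-1)*lag}).

    Hypothetical helper for illustration only: `dim` is the embedding
    dimension and `lag` the delay in samples (both are assumptions).
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag  # first index with a full history available
    return [
        tuple(series[t - k * lag] for k in range(dim))
        for t in range(start, len(series))
    ]
```

For example, `delay_embed([0, 1, 2, 3, 4], dim=3, lag=1)` returns `[(2, 1, 0), (3, 2, 1), (4, 3, 2)]`. Which embedding dimension is sufficient depends on the part of the theorem that is not visible here.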

Figures (24)

  • Figure 1: a) State space of a bistable neuron model with co-existing point (left basin) and limit cycle (right basin) attractor (see Appx. \ref{app:neuron_model} for details). The two basins are separated by the stable manifold of a saddle node. Trajectories follow the system's vector field, indicated by arrows with red shading indicating flow velocity (darker = faster). The corresponding TS (right) of model variables $V(t)$ (solid) and $n(t)$ (dashed) converge either to a point attractor (top) or a stable limit cycle (bottom) depending on the basin in which the trajectory was initialized (as indicated by the blue arrows). b) N-tipping in the true neuron model (gray) and in an AL-RNN (red) trained on trajectories from both basins. Noise was added in both models and eventually drives them across the basin boundary into cyclic activity (cf. Fig. \ref{fig:N-tipping_state_space}).
  • Figure 2: a) Bifurcation diagram of a spiking neuron model depending on a control parameter $h$ (see Appx. \ref{app:neuron_model}; durstewitz_implications_2009). Stable fixed points are indicated by solid black lines, unstable ones by dashed black lines. The gray-shaded area corresponds to a stable limit cycle. Graphs below give snapshots of the state space for different values of $h$. b) B-tipping from a 'bursting' into a 'spiking' regime in the full simulated neuron model (gray) is successfully predicted by an AL-RNN (red) trained only on TS data up to the black dashed line.
  • Figure 3: DSR measures: Comparison of geometrical (dis)agreement ($D_\textrm{stsp}$), power spectral distance ($D_\textrm{H}$), Kaplan-Yorke fractal dimension ($D_\textrm{KY}$), and max. Lyapunov exponent ($\lambda_\textrm{max}$) on a) a poor and b) a good reconstruction of the chaotic Lorenz-63 system by an AL-RNN. See also Fig. \ref{fig:DSR_measures_all}.
  • Figure 4: The limits of predictability: Three TS from the same Lorenz-63 system (same parameters) started at the same initial condition quickly diverge even with just $1\%$ of noise, while with $10\%$ noise prediction beyond 1 Lyapunov time becomes hopeless.
  • Figure 5: a): Long-term geometrical ($D_\textrm{stsp}$) and temporal ($D_\textrm{H}$) forecast accuracy (cf. Appx. \ref{appx:measures}) as median $\pm$ MAD, comparing DSR and TS models, evaluated on $10{,}000$-step roll-outs. Lower = better. Results for both custom-trained (triangles) and foundation models (dots) are shown. b): Short-term forecasting accuracy ($\text{MASE}$); lighter colors = better, green stars = best $\text{MASE}$. c): Example long-term forecasts of ETTh1 data for DSR and TS models, exposing the failure of TS models to capture long-term behavior. See also Figs. \ref{fig:longterm_ETTh1} & \ref{fig:longterm_weather_temp}.
  • ...and 19 more figures
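
The limits of predictability described in the Figure 4 caption can be illustrated with a minimal sketch. It is not the authors' experiment. As assumptions, it uses the standard Lorenz-63 parameters ($\sigma = 10$, $\rho = 28$, $\beta = 8/3$), a fixed-step fourth-order Runge-Kutta integrator with step size 0.01, and a tiny offset in the initial condition in place of the noise mentioned in the caption. All function names are hypothetical.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz-63 vector field (standard parameters assumed, not from the paper)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step of size dt."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz63(state)
    k2 = lorenz63(shift(state, k1, dt / 2))
    k3 = lorenz63(shift(state, k2, dt / 2))
    k4 = lorenz63(shift(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(init, n_steps, dt=0.01):
    """Return the trajectory as a list of n_steps + 1 states, starting at init."""
    traj = [tuple(init)]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], dt))
    return traj


def separation(traj_a, traj_b):
    """Euclidean distance between two trajectories at each time step."""
    return [
        sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        for p, q in zip(traj_a, traj_b)
    ]


if __name__ == "__main__":
    # Two runs whose initial x-coordinates differ by 1e-6 (illustrative choice).
    ref = simulate((1.0, 1.0, 1.0), 4000)
    pert = simulate((1.0 + 1e-6, 1.0, 1.0), 4000)
    dist = separation(ref, pert)
    print("initial separation:", dist[0])
    print("largest separation:", max(dist))
```

In this sketch the distance between the two runs starts at about $10^{-6}$ and grows until it is comparable to the size of the attractor, which is the qualitative behavior the caption refers to. The step size, run length, and size of the offset are illustrative choices only.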

Theorems & Definitions (5)

  • Definition 2.1: Attractor and basin of attraction
  • Definition 2.2: Topological equivalence and conjugacy
  • Definition 1.1
  • Theorem 1.2
  • Definition 1.3