Out-of-Domain Generalization in Dynamical Systems Reconstruction

Niclas Göring, Florian Hess, Manuel Brenner, Zahra Monfared, Daniel Durstewitz

TL;DR

This work tackles out-of-domain generalization (OODG) in dynamical systems reconstruction (DSR), focusing on extrapolation to unseen dynamical regimes in multistable systems. It introduces a principled framework grounded in measure theory and topology, defining statistical and topological generalization errors via sliced Wasserstein-1 ($SW_1$) distances between occupation measures and Hausdorff distances ($d_H$) between $\omega$-limit sets. The authors prove that black-box deep-learning approaches lack the structural priors needed to guarantee OODG, and validate this with experiments on Duffing and Lorenz-like systems, where models fail to generalize across basins. They show that strong priors, such as the function libraries used by SINDy, can achieve strict OODG under identifiability conditions, whereas universal approximators generally do not. They also discuss how initialization and optimization biases steer the search toward monostable or saddle regimes, and suggest directions such as multistability-aware training and physics-informed priors. Code is released to enable reproducibility and further exploration.
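
As a rough illustration of the two distances named above, the following sketch estimates a sliced Wasserstein-1 distance between two empirical occupation measures (point clouds sampled along trajectories) and a Hausdorff distance between two sampled limit sets. The function names, the number of random projections, and the toy point clouds are assumptions made for illustration, not the authors' released code. The errors reported in the paper are additionally evaluated over a grid of initial conditions (see Figure 3), which is omitted here.

```python
import numpy as np


def sliced_wasserstein_1(x, y, n_projections=200, seed=0):
    """Monte Carlo estimate of SW_1 between two point clouds of equal size.

    x, y: arrays of shape (n_samples, dim), read as samples from two
    empirical occupation measures with uniform weights.
    """
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n_projections, x.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project onto each random direction; for equal-size samples the 1D
    # Wasserstein-1 distance is the mean absolute gap between sorted values.
    px = np.sort(x @ directions.T, axis=0)
    py = np.sort(y @ directions.T, axis=0)
    return float(np.mean(np.abs(px - py)))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(pairwise.min(axis=1).max(), pairwise.min(axis=0).max()))


# Toy example (assumed data): a limit cycle on the unit circle versus a
# trajectory that sits at an equilibrium point at the origin.
theta = np.linspace(0.0, 2.0 * np.pi, 500, endpoint=False)
limit_cycle = np.column_stack([np.cos(theta), np.sin(theta)])
equilibrium = np.zeros_like(limit_cycle)

sw1 = sliced_wasserstein_1(limit_cycle, equilibrium)
d_h = hausdorff_distance(limit_cycle, equilibrium[:1])
print(f"SW_1 estimate: {sw1:.3f}, Hausdorff distance: {d_h:.3f}")
```

Both quantities vanish when the two point clouds coincide and grow when a model settles on a different attractor than the ground truth, which is the situation the statistical and topological errors are meant to flag.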

Abstract

In science we are interested in finding the governing equations, the dynamical rules, underlying empirical phenomena. While traditionally scientific models are derived through cycles of human insight and experimentation, recently deep learning (DL) techniques have been advanced to reconstruct dynamical systems (DS) directly from time series data. State-of-the-art dynamical systems reconstruction (DSR) methods show promise in capturing invariant and long-term properties of observed DS, but their ability to generalize to unobserved domains remains an open challenge. Yet, this is a crucial property we would expect from any viable scientific theory. In this work, we provide a formal framework that addresses generalization in DSR. We explain why and how out-of-domain (OOD) generalization (OODG) in DSR profoundly differs from OODG considered elsewhere in machine learning. We introduce mathematical notions based on topological concepts and ergodic theory to formalize the idea of learnability of a DSR model. We formally prove that black-box DL techniques, without adequate structural priors, generally will not be able to learn a generalizing DSR model. We also show this empirically, considering major classes of DSR algorithms proposed so far, and illustrate where and why they fail to generalize across the whole phase space. Our study provides the first comprehensive mathematical treatment of OODG in DSR, and gives a deeper conceptual understanding of where the fundamental problems in OODG lie and how they could possibly be addressed in practice.

Paper Structure (56 sections, 8 theorems, 97 equations, 23 figures, 5 tables)

This paper contains 56 sections, 8 theorems, 97 equations, 23 figures, 5 tables.

Key Result

Theorem 3.3

Assume $\Phi$ is multistable with decomposition as in Eq. \ref{eq_statespacedecomp} and connected basins, and that there exists one attractor $A_k, \ k \leq n$, not reconstructed by $\Phi_R$. Then the generalization error of $\Phi_R$ is proportional to the volume of the basin of this non-reconstructed attractor. This statement naturally generalizes to the case of multiple non-reconstructed attractors (with dif…
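
The theorem ties the error to the volume of the basin belonging to the attractor that the model misses. As a hedged illustration of that quantity, the sketch below estimates which fraction of a box of initial conditions lies in one basin of a bistable, unforced Duffing oscillator. The parameter values, the box, the grid resolution, the integration settings, and all helper names are assumptions chosen for illustration and are not taken from the paper's experimental setup.

```python
import numpy as np

# Assumed parameters (not from the paper): damped, unforced Duffing oscillator
#   x'' + DELTA * x' + ALPHA * x + BETA * x**3 = 0
# with a double-well potential, i.e. two stable equilibria at x = -1 and x = +1.
ALPHA, BETA, DELTA = -1.0, 1.0, 0.5


def duffing(state):
    """Vector field for a batch of states of shape (n, 2) = (x, v)."""
    x, v = state[..., 0], state[..., 1]
    return np.stack([v, -DELTA * v - ALPHA * x - BETA * x**3], axis=-1)


def rk4_flow(state, dt=0.02, n_steps=2500):
    """Integrate all states forward with a fixed-step fourth-order Runge-Kutta."""
    for _ in range(n_steps):
        k1 = duffing(state)
        k2 = duffing(state + 0.5 * dt * k1)
        k3 = duffing(state + 0.5 * dt * k2)
        k4 = duffing(state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state


def basin_fraction(n_grid=40, half_width=2.0):
    """Fraction of grid points in [-half_width, half_width]^2 that end near x = -1."""
    axis = np.linspace(-half_width, half_width, n_grid)
    xx, vv = np.meshgrid(axis, axis)
    initial = np.stack([xx.ravel(), vv.ravel()], axis=-1)
    final = rk4_flow(initial)
    # Classify each trajectory by the well in which it ends up.
    return float(np.mean(final[:, 0] < 0.0))


# If a model reconstructed only the attractor at x = +1, this is the share of
# the evaluation box on which its long-term behavior would be wrong.
fraction_missed = basin_fraction()
print(f"Relative volume of the missed basin: {fraction_missed:.2f}")
```

Under the theorem's assumptions, a model that misses the attractor at $x=-1$ would incur an error that scales with this fraction, which motivates evaluating over initial conditions from all basins rather than only the training basin.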

Figures (23)

  • Figure 1: In-distribution generalization within one basin (right; van-der-Pol oscillator) vs. OODG across basins (left; neuron model with a limit cycle corresponding to spiking activity and an equilibrium point corresponding to the resting potential).
  • Figure 2: a) Example reconstructions using SINDy (details in Appx. \ref{appx:sindy}). The underlying vector field (VF) has two cycle solutions. One solves an algebraic equation (red), while the other does not (black). The VF is only correctly identified from a trajectory containing the inner cycle (center), but not for the outer cycle (right). b) SINDy needs the proper function library to correctly infer a system across the whole state space (center). If the third-order term present in the Duffing equations is missing from the library (right), the inferred VF may only be locally correct (or not correct at all for more complex systems).
  • Figure 3: Learnability of three SOTA DSR algorithms evaluated on the Duffing system in a multistable regime. a) Reconstructions of DSR models trained on four ground-truth trajectories (blue) from one basin. Red trajectories are freely generated using initial conditions of the training data and the respective DSR model. Grey trajectories comprise example ground-truth test trajectories and generated ones from both the training basin and OOD basin. While training data trajectories align with the ground-truth, all models fail to properly generalize to the unobserved attractor/basin. b) Empirical cumulative distribution function (eCDF) of both $\mathcal{E}_{\mathrm{stat}}$ and $\mathcal{E}_{\mathrm{top}}$ based on $N=50$ independent trainings of each DSR model evaluated over a grid of initial conditions covering both basins (see Fig. \ref{fig:duffing_U_grid}).
  • Figure 4: a) Distribution of Shannon entropies (in nats) for the limit sets of shPLRNNs ($M=2, H=100$) initialized with different gains (parameter variances) using the Glorot uniform scheme. For a low gain ($\sigma =0.3$), as predominantly used in DSR, the attractors of all models at initialization had $H=0$, which means that they consisted only of a single equilibrium point. For higher gains, further peaks at $H>0$ started to appear, implying that more and/or higher-order objects (like cycles) exist upon initialization. b) Mean Shannon entropy for the same data plotted against gain, using the Glorot uniform and Glorot normal initialization schemes.
  • Figure 5: a) Statistical error distribution on basins $B(A_1)$ and $B(A_2)$ for $20$ generalizing models (green) and $20 \times 20$ models retrained (purple) using only $B(A_1)$ data. b) Illustration of loss landscapes using data from just one (left) or both (right) basin(s) of attraction, with parameters corresponding to generalizing solution ($\bm{\theta}_{\mathrm{gen}}$), and to models retrained for $125k$ ($\bm{\theta}_{re}^1$) and $250k$ ($\bm{\theta}_{re}^2$) parameter updates, respectively. Note that $\ell_{\mathrm{M}}$ does not exhibit the spurious loss valley present in $\ell_{\mathrm{B(A_1)}}$.
  • ...and 18 more figures

Theorems & Definitions (30)

  • Definition 2.1
  • Definition 3.1
  • Definition 3.2
  • Theorem 3.3
  • proof
  • Definition 3.4
  • Definition 3.5
  • Theorem 4.1
  • proof
  • Theorem 4.2
  • ...and 20 more