Invariant Measures for Data-Driven Dynamical System Identification: Analysis and Application

Jonah Botvinick-Greenhouse

TL;DR

The paper addresses data-driven dynamical system identification by matching observed physical invariant measures rather than pointwise trajectory data, which makes the approach robust to noise, chaos, and slow sampling. It develops a PDE-constrained approach using stationary Fokker–Planck surrogates, coupled with gradient-based optimization, and improves scalability via a data-adaptive, Galerkin-based approximation of the Perron–Frobenius operator (PFO) with Monte Carlo integration. A key theoretical advance is the use of time-delay coordinates (Takens embedding) to achieve uniqueness in identification, supported by proofs linking delay-measure equality to topological conjugacy, and by practical demonstrations with multiple observables. Numerical experiments across synthetic systems and high-dimensional models (e.g., Lorenz-96 and Hall-effect thruster data) show accurate velocity recovery, scalable PFO approximations, and reliable uncertainty quantification.

Abstract

We propose a novel approach for performing dynamical system identification, based upon the comparison of simulated and observed physical invariant measures. While standard methods adopt a Lagrangian perspective by directly treating time-trajectories as inference data, we take on an Eulerian perspective and instead seek models fitting the observed global time-invariant statistics. With this change in perspective, we gain robustness against pervasive challenges in system identification including noise, chaos, and slow sampling. In the first half of this paper, we pose the system identification task as a partial differential equation (PDE) constrained optimization problem, in which synthetic stationary solutions of the Fokker-Planck equation, obtained as fixed points of a finite-volume discretization, are compared to physical invariant measures extracted from observed trajectory data. In the latter half of the paper, we improve upon this approach in two crucial directions. First, we develop a Galerkin-inspired modification to the finite-volume surrogate model, based on data-adaptive unstructured meshes and Monte-Carlo integration, enabling the approach to efficiently scale to high-dimensional problems. Second, we leverage Takens' seminal time-delay embedding theory to introduce a critical data-dependent coordinate transformation which can guarantee unique system identifiability from the invariant measure alone. This contribution resolves a major challenge of system identification through invariant measures, as systems exhibiting distinct transient behaviors may still share the same time-invariant statistics in their state-coordinates. Throughout, we present comprehensive numerical tests which highlight the effectiveness of our approach on a variety of challenging system identification tasks.
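
For orientation, the PDE-constrained formulation described above can be sketched in generic notation. The symbols below are assumptions for illustration and need not match the paper's own notation: $v_\theta$ is a parametrized velocity field, $D > 0$ a constant scalar diffusion coefficient, $\rho_\theta$ the stationary density of the surrogate model, $\rho^*$ the invariant measure estimated from data, and $\mathcal{J}$ an unspecified discrepancy between measures.

$$
\min_{\theta} \; \mathcal{J}\big(\rho_\theta, \rho^*\big)
\quad \text{subject to} \quad
\nabla \cdot \big( v_\theta(x)\, \rho_\theta(x) \big) = D\, \Delta \rho_\theta(x),
\qquad \rho_\theta \ge 0, \qquad \int \rho_\theta(x)\, dx = 1 .
$$

The constraint is the steady-state form of a Fokker-Planck equation $\partial_t \rho = -\nabla \cdot (v_\theta \rho) + D \Delta \rho$ with the time derivative set to zero; the abstract states that such stationary solutions are obtained as fixed points of a finite-volume discretization.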

Paper Structure

This paper contains 36 sections, 18 theorems, 82 equations, 11 figures, 2 tables.

Key Result

Theorem 2.1

Let $T:U \to U$ be a diffeomorphism of an open set $U\subseteq \mathbb{R}^n$, $A\subseteq U$ be compact and $m > 2d$ where $d=\textup{boxdim}(A)$. Suppose that the periodic points of $T$ with degree at most $m$ satisfy Assumption assumption:1 on $A$. Then, it holds that $x\mapsto (y(x),y(T(x)),\dots

Figures (11)

  • Figure 1: Comparison with the SINDy and the Neural ODE frameworks for reconstructing the velocity from slowly sampled observations. While SINDy and Neural ODE can only reconstruct an accurate model from a quickly sampled trajectory, our approach is robust to slowly sampled data. See the numerical experiments section for additional experiment details.
  • Figure 2: Flowchart describing the paper's main sections and techniques.
  • Figure 3: Delay-coordinate invariant measures improve identifiability of the torus rotation $T_{\alpha,\beta}(z_1,z_2)= (z_1+\alpha,z_2+\beta) \pmod{ 1}.$ While different choices of $(\alpha,\beta)$ all lead to the same state-coordinate invariant measure (top row), the systems can be distinguished by their delay-invariant measures (bottom row).
  • Figure 4: As the mesh size of the forward model discretization is refined, we visually observe the convergence of the computed steady-state solution (a) to the approximate physical measure (b). The Van der Pol oscillator with $c = 1$ and $D = 0.001$ is used in this example.
  • Figure 5: Reconstructing the $\dot{x}$ component of the Lorenz-63 system from its observed, noisy occupation measure. We used a mesh-spacing of $\Delta x = 2$ and a diffusion coefficient of $D = 10.$
  • ...and 6 more figures
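
The identifiability effect described in the Figure 3 caption can be illustrated with a small numerical sketch. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the two rotation pairs $(\alpha,\beta)$, the scalar observable $y(z) = \cos(2\pi z_1) + \cos(2\pi z_2)$, the use of two delay coordinates, the histogram resolution, and the L1 distance between histograms. The sketch estimates the state-coordinate occupation measure and the delay-coordinate occupation measure for two torus rotations and compares them.

```python
import numpy as np

def torus_orbit(alpha, beta, n_steps):
    """Orbit of T(z1, z2) = (z1 + alpha, z2 + beta) mod 1 started at the origin."""
    k = np.arange(n_steps)[:, None]
    return (k * np.array([alpha, beta])) % 1.0

def observable(z):
    """Hypothetical scalar observable y(z); an illustrative choice only."""
    return np.cos(2 * np.pi * z[:, 0]) + np.cos(2 * np.pi * z[:, 1])

def occupation_histogram(points, bins, ranges):
    """Normalized histogram: a crude estimate of the occupation measure."""
    hist, _ = np.histogramdd(points, bins=bins, range=ranges)
    return hist / hist.sum()

def l1_distance(p, q):
    return float(np.abs(p - q).sum())

def measures(alpha, beta, n_steps=200_000, bins=20):
    z = torus_orbit(alpha, beta, n_steps)
    y = observable(z)
    # State-coordinate measure on the torus [0, 1)^2.
    state = occupation_histogram(z, bins, [(0.0, 1.0), (0.0, 1.0)])
    # Delay-coordinate measure of the pairs (y(z_k), y(T(z_k))).
    delay_points = np.column_stack([y[:-1], y[1:]])
    delay = occupation_histogram(delay_points, bins, [(-2.0, 2.0), (-2.0, 2.0)])
    return state, delay

# Two illustrative irrational rotations (assumed values, not from the paper).
state_a, delay_a = measures(np.sqrt(2) / 10, np.sqrt(3) / 10)
state_b, delay_b = measures(np.sqrt(5) / 10, np.sqrt(7) / 10)

state_dist = l1_distance(state_a, state_b)
delay_dist = l1_distance(delay_a, delay_b)
print(f"L1 distance, state-coordinate measures: {state_dist:.4f}")
print(f"L1 distance, delay-coordinate measures: {delay_dist:.4f}")
```

Under these assumptions both orbits equidistribute on the torus, so the state-coordinate histograms should nearly coincide, while the delay-coordinate histograms depend on $(\alpha,\beta)$ and should differ noticeably. This is only a finite-sample illustration of the caption's claim, not a reproduction of the figure.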

Theorems & Definitions (42)

  • Definition 2.1: Perron–Frobenius operator
  • Definition 2.2: Support of a measure
  • Definition 2.3: Pushforward measure
  • Definition 2.4: Invariant measure
  • Definition 2.5: Basin of attraction
  • Theorem 2.1: Fractal Takens' Embedding
  • Theorem 4.1
  • Proposition 4.1
  • Proposition 5.1
  • Definition 5.1: Time-delay map
  • ...and 32 more