Data assimilation and discrepancy modeling with shallow recurrent decoders

Yuxuan Bao, J. Nathan Kutz

TL;DR

The paper tackles the challenge of closing the simulation-to-reality (SIM2REAL) gap in high-dimensional, spatiotemporal systems with sparse sensors. It introduces DA-SHRED, a hybrid framework that leverages a SHRED-derived latent space trained on simulations, refines it with real sensor data, and uses SINDy to discover missing physics L' in the latent dynamics. Through demonstrations on 2D damped Kuramoto–Sivashinsky, 2D Kolmogorov flow, Gray–Scott reaction–diffusion, and rotating detonation engines, DA-SHRED achieves rapid convergence and accurate state reconstruction while recovering physically meaningful discrepancy terms. The approach combines temporal encoding, sparse sensing, and interpretable discrepancy modeling, offering a data-efficient path toward real-time assimilation and physics-informed correction in complex systems, with future extensions to adaptive bases and multiscale or stochastic dynamics.

Abstract

The requirements of modern sensing are rapidly evolving, driven by increasing demands for data efficiency, real-time processing, and deployment under limited sensing coverage. Complex physical systems are often characterized through the integration of a limited number of point sensors in combination with scientific computations which approximate the dominant, full-state dynamics. Simulation models, however, inevitably neglect small-scale or hidden processes, are sensitive to perturbations, or oversimplify parameter correlations, leading to reconstructions that often diverge from the reality measured by sensors. This creates a critical need for data assimilation, the process of integrating observational data with predictive simulation models to produce coherent and accurate estimates of the full state of complex physical systems. We propose a machine learning framework for Data Assimilation with a SHallow REcurrent Decoder (DA-SHRED) which bridges the simulation-to-real (SIM2REAL) gap between computational modeling and experimental sensor data. For real-world physics systems modeling high-dimensional spatiotemporal fields, where the full state cannot be directly observed and must be inferred from sparse sensor measurements, we leverage the latent space learned from a reduced simulation model via SHRED, and update these latent variables using real sensor data to accurately reconstruct the full system state. Furthermore, our algorithm incorporates a sparse identification of nonlinear dynamics based regression model in the latent space to identify functionals corresponding to missing dynamics in the simulation model. We demonstrate that DA-SHRED successfully closes the SIM2REAL gap and additionally recovers missing dynamics in highly complex systems, demonstrating that the combination of efficient temporal encoding and physics-informed correction enables robust data assimilation.
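The latent-space discrepancy regression described in the abstract can be illustrated with a small toy problem. The sketch below is not the authors' implementation: it assumes a two-dimensional latent state, a hypothetical simulation model (an undamped oscillator) and a hypothetical "real" system that adds a linear damping term, loosely echoing the damped test cases. It applies sequentially thresholded least squares, a common SINDy optimizer, to the residual between the observed latent derivative and the simulation model's prediction. All function names, the candidate library, the threshold, and the damping coefficient are illustrative assumptions.

```python
import numpy as np

DAMPING = 0.3  # hypothetical coefficient of the term missing from the simulation


def f_sim(z):
    """Toy simulation-model latent dynamics: an undamped oscillator."""
    return np.stack([z[..., 1], -z[..., 0]], axis=-1)


def f_real(z):
    """Toy 'real' latent dynamics: the same oscillator plus linear damping."""
    missing = np.stack([np.zeros_like(z[..., 0]), -DAMPING * z[..., 1]], axis=-1)
    return f_sim(z) + missing


def rk4(f, z0, dt, n_steps):
    """Integrate dz/dt = f(z) with the classical fourth-order Runge-Kutta scheme."""
    Z = np.empty((n_steps + 1, len(z0)))
    Z[0] = z0
    for k in range(n_steps):
        z = Z[k]
        k1 = f(z)
        k2 = f(z + 0.5 * dt * k1)
        k3 = f(z + 0.5 * dt * k2)
        k4 = f(z + dt * k3)
        Z[k + 1] = z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return Z


def library(Z):
    """Candidate terms [1, z_i, z_i*z_j (i <= j)] for latent states Z of shape (T, r)."""
    T, r = Z.shape
    cols, names = [np.ones(T)], ["1"]
    for i in range(r):
        cols.append(Z[:, i])
        names.append(f"z{i}")
    for i in range(r):
        for j in range(i, r):
            cols.append(Z[:, i] * Z[:, j])
            names.append(f"z{i}*z{j}")
    return np.column_stack(cols), names


def stlsq(Theta, Y, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit the rest."""
    Xi = np.linalg.lstsq(Theta, Y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(Y.shape[1]):
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], Y[:, k], rcond=None)[0]
    return Xi


def demo():
    dt = 0.01
    Z = rk4(f_real, np.array([1.0, 0.0]), dt, 2000)  # stand-in for assimilated latents
    dZ = np.gradient(Z, dt, axis=0)                  # finite-difference derivative
    residual = dZ - f_sim(Z)                         # part the simulation cannot explain
    Theta, names = library(Z)
    return stlsq(Theta, residual), names


if __name__ == "__main__":
    Xi, names = demo()
    for k in range(Xi.shape[1]):
        terms = [f"{Xi[i, k]:+.3f}*{n}" for i, n in enumerate(names) if Xi[i, k] != 0]
        print(f"missing term in dz{k}/dt:", " ".join(terms) or "0")
```

Regressing on the residual rather than on the full derivative keeps the known simulation physics fixed, so the sparse fit only has to explain the discrepancy. In practice the latent trajectory would come from a trained SHRED network rather than a known ODE, and noisy derivatives would likely call for smoothing or a weak formulation.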

Paper Structure

This paper contains 26 sections, 4 theorems, 56 equations, 11 figures, 2 algorithms.

Key Result

Theorem 1

Let (eq:phs_basic) be a PHS on a simply connected domain $\mathcal{X}$. Consider perturbations $\Delta J, \Delta R, \Delta G$ (matrix fields) and a 1-form perturbation $\delta$ so that the perturbed dynamics are [...]. Assume: [...]. Then there exists $\widetilde{H} = H + \varepsilon$ with $\mathrm{d}\varepsilon = \delta$ and such that the perturbed system can be written globally in PHS form [...].
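For orientation, and assuming that PHS abbreviates port-Hamiltonian system and that eq:phs_basic denotes the standard input-state-output form (the referenced equation is not reproduced in this summary, so this is an assumption rather than the paper's exact statement), such a system is commonly written as

$$\dot{x} = \bigl(J(x) - R(x)\bigr)\,\nabla H(x) + G(x)\,u, \qquad J = -J^{\top}, \quad R = R^{\top} \succeq 0 .$$

The role of simple connectedness can then be read off from the Poincaré lemma, under the further assumption that $\delta$ is closed: every closed 1-form on a simply connected domain is exact, so

$$\mathrm{d}\delta = 0 \ \text{on } \mathcal{X} \;\Longrightarrow\; \exists\, \varepsilon : \ \mathrm{d}\varepsilon = \delta, \qquad \mathrm{d}H + \delta = \mathrm{d}(H + \varepsilon) = \mathrm{d}\widetilde{H} .$$

In this reading, the 1-form perturbation is absorbed into a modified Hamiltonian $\widetilde{H}$, which is why the perturbed system can keep a PHS structure.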

Figures (11)

  • Figure 1: The algorithmic structure of DA-SHRED. With the real physics unobserved, the model exploits the latent space trained on known simulation data using a traditional SHRED network, and deploys it on real sensor data to close the discrepancy. It is assumed that the full state space of the real physics is never observed in practice. The variables, data, and models are summarized in Figure 2.
  • Figure 2: Summary of variables, data, and models used in the DA-SHRED formulation. The state space has dimension $n$; there are $m$ snapshots of temporal measurements from $p$ sensors for SHRED training, with an additional $q$ sensors deployed in reality.
  • Figure 3: Results for the 2D damped Kuramoto–Sivashinsky equation. Panels in the same row are taken at the same timestep. The left column (a) shows the undamped simulation model (2D KS equation without damping); the middle column (b) shows the unknown real physics (2D KS equation with damping); and the right column (c) shows the state space restored by DA-SHRED. The error of the DA-SHRED correction relative to reality is shown in (d), where an order-of-magnitude improvement in accuracy is achieved within $T=20$.
  • Figure 4: Results for the 2D damped Kolmogorov flow. Panels in the same row are taken at the same timestep. The left column shows the undamped simulation model (2D Kolmogorov flow without damping); the middle column shows the unknown real physics (2D Kolmogorov flow with linear damping); and the right column shows the state space restored by DA-SHRED.
  • Figure 5: Results for the 2D damped Gray–Scott model. The left panel shows the undamped simulation model, the right panel shows the real model, and the bottom panel shows the DA-SHRED result as it converges.
  • ...and 6 more figures

Theorems & Definitions (9)

  • Theorem 1: Persistence of PHS form under natural perturbations
  • Proposition 1: Exactness / Poincaré lemma
  • proof
  • Remark 1
  • Proposition 2: Persistence under admissible matrix perturbations
  • proof
  • Theorem 2: Persistence of PHS form under natural perturbations
  • proof
  • Remark 2