DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

Martin Andrae, Erik Larsson, So Takao, Tomas Landelius, Fredrik Lindsten

TL;DR

DAISI addresses the data assimilation challenge in high-dimensional, nonlinear settings by leveraging a stationary flow-based prior learned from dynamical systems and integrating it with a forecast ensemble through inverse sampling. The method maps forecast states into latent space via backward SDE inversion and then performs guided forward sampling to produce approximate samples from the filtering distribution, enabling zero-shot conditioning on observations with flexible guidance. Across Lorenz '63, SQG, and SEVIR, DAISI yields accurate, temporally coherent filtering under sparse, noisy, and multimodal observations, often matching or surpassing traditional filters while maintaining ensemble diversity. The framework is modular, scalable, and interpretable through hyperparameters $t_{\min}$ and $\epsilon$, with clear directions for improving exactness and computational efficiency.

Abstract

Data assimilation (DA) is a cornerstone of scientific and engineering applications, combining model forecasts with sparse and noisy observations to estimate latent system states. Classical DA methods, such as the ensemble Kalman filter, rely on Gaussian approximations and heuristic tuning (e.g., inflation and localization) to scale to high dimensions. While often successful, these approximations can make the methods unstable or inaccurate when the underlying distributions of states and observations depart significantly from Gaussianity. To address this limitation, we introduce DAISI, a scalable filtering algorithm built on flow-based generative models that enables flexible probabilistic inference using data-driven priors. The core idea is to use a stationary, pre-trained generative prior to assimilate observations via guidance-based conditional sampling while incorporating forecast information through a novel inverse-sampling step. This step maps the forecast ensemble into a latent space to provide initial conditions for the conditional sampling, allowing us to encode model dynamics into the DA pipeline without having to retrain or fine-tune the generative prior at each assimilation step. Experiments on challenging nonlinear systems show that DAISI achieves accurate filtering results in regimes with sparse, noisy, and nonlinear observations where traditional methods struggle.
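The two-stage idea described above (invert the forecast ensemble into latent space, then sample forward under guidance) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: the prior is taken to be a Gaussian so that the interpolant velocity field is known in closed form, the observation is a direct noisy measurement of the state, the guided dynamics are replaced by the exact velocity of the Gaussian posterior, and the noise level is set to $\epsilon = 0$ so that the SDE reduces to a deterministic ODE. The names `velocity`, `integrate`, `gaussian_posterior`, and `daisi_like_update` are hypothetical, and treating `t_min` as the time at which inversion stops is also an assumption.

```python
import numpy as np


def velocity(t, x, mean, std):
    """Exact velocity of the linear interpolant x_t = (1 - t) z + t x_1,
    with z ~ N(0, 1) and x_1 ~ N(mean, std**2) independent (toy assumption)."""
    var_t = (1.0 - t) ** 2 + (t * std) ** 2
    return mean + (t * std**2 - (1.0 - t)) / var_t * (x - t * mean)


def integrate(x, t_start, t_end, mean, std, n_steps=2000):
    """Explicit Euler integration of dx/dt = velocity from t_start to t_end.
    Running with t_end < t_start integrates backward in time (inversion)."""
    x = np.array(x, dtype=float)
    dt = (t_end - t_start) / n_steps
    t = t_start
    for _ in range(n_steps):
        x = x + dt * velocity(t, x, mean, std)
        t += dt
    return x


def gaussian_posterior(mean, std, y, obs_std):
    """Posterior of x ~ N(mean, std**2) given y = x + N(0, obs_std**2)."""
    post_var = 1.0 / (1.0 / std**2 + 1.0 / obs_std**2)
    post_mean = post_var * (mean / std**2 + y / obs_std**2)
    return post_mean, np.sqrt(post_var)


def daisi_like_update(forecast, y, prior_mean, prior_std, obs_std, t_min=0.0):
    """Toy analogue of one assimilation step (hypothetical helper).

    (i)  Invert: push forecast members backward with the unconditional flow.
    (ii) Guide: push the resulting latents forward with the conditional flow.
    """
    latents = integrate(forecast, 1.0, t_min, prior_mean, prior_std)
    post_mean, post_std = gaussian_posterior(prior_mean, prior_std, y, obs_std)
    return integrate(latents, t_min, 1.0, post_mean, post_std)


rng = np.random.default_rng(0)
forecast = rng.normal(1.5, 0.5, size=8)  # stand-in forecast ensemble
analysis = daisi_like_update(
    forecast, y=2.0, prior_mean=0.0, prior_std=2.0, obs_std=0.5
)
print(np.column_stack([forecast, analysis]))
```

In this Gaussian toy setting with `t_min=0.0`, the exact flow maps are affine, so each forecast member keeps its relative position in the ensemble while being pulled toward the observation. With a learned prior and a nonlinear observation operator, the velocity and the guidance term would come from the trained model rather than from closed-form expressions.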

Paper Structure

This paper contains 61 sections, 3 theorems, 97 equations, 34 figures, 11 tables, 1 algorithm.

Key Result

Proposition 3.1

Assume that the marginal laws $\{\rho_t\}_{t \in [0,1]}$ and $\{\rho_t^{\bm{y}}\}_{t \in [0,1]}$ of the interpolants bridging $\rho_0=\mathcal{N}(\boldsymbol{0}, \mathbf{I})$ with $\rho_1 = \mathbb{P}_\infty$ and $\rho_1 = \mathbb{P}_\infty^{\bm{y}}$, respectively, satisfy a log-Sobolev inequality w with equality at $\epsilon = 0$, where $\mathcal{KL}(q || p)$ denotes the Kullback-Leibler divergen

Figures (34)

  • Figure 1: Our method, DAISI, performs data assimilation by combining a flow-based unconditional prior $\mathbb{P}_\infty$ with the forecast ensemble $\hat{\pi}_n$. To condition on an observation ${\bm{y}}_n$, one could apply the guided SDE starting from random noise, producing $\mathbb{P}_\infty^{{\bm{y}}}$ and ignoring the forecast (blue). Instead, DAISI (green): (i) applies the backward SDE to the forecast ensemble, producing inverted samples $\Psi_\sharp \hat{\pi}_n$, and (ii) uses these latents as initial conditions for the guided SDE, generating approximate samples from the filtering distribution $\pi_n$.
  • Figure 2: $t_{\min} = 0.01$
  • Figure 3: $t_{\min} = 0.3$
  • Figure 4: $t_{\min} = 0.6$
  • Figure 6: BPF
  • ...and 29 more figures

Theorems & Definitions (7)

  • Proposition 3.1
  • proof
  • Proposition 1.1
  • Proposition 3.1
  • proof
  • Remark 3.2
  • Remark 3.3