D-Flow SGLD: Source-Space Posterior Sampling for Scientific Inverse Problems with Flow Matching

Meet Hemant Parikh, Yaqin Chen, Jian-Xun Wang

TL;DR

This work studies training-free conditional generation for scientific inverse problems under Flow Matching (FM) priors. It proposes D-Flow SGLD, a source-space posterior sampling method that augments differentiable source inference with preconditioned stochastic gradient Langevin dynamics. The sampler explores the source posterior induced by new measurement operators at scale, without retraining the prior or modifying the learned FM dynamics.

Abstract

Data assimilation and scientific inverse problems require reconstructing high-dimensional physical states from sparse and noisy observations, ideally with uncertainty-aware posterior samples that remain faithful to learned priors and governing physics. While training-free conditional generation is well developed for diffusion models, corresponding conditioning and posterior sampling strategies for Flow Matching (FM) priors remain comparatively under-explored, especially on scientific benchmarks where fidelity must be assessed beyond measurement misfit. In this work, we study training-free conditional generation for scientific inverse problems under FM priors and organize existing inference-time strategies by where measurement information is injected: (i) guided transport dynamics that perturb sampling trajectories using likelihood information, and (ii) source-distribution inference that performs posterior inference over the source variable while keeping the learned transport fixed. Building on the latter, we propose D-Flow SGLD, a source-space posterior sampling method that augments differentiable source inference with preconditioned stochastic gradient Langevin dynamics, enabling scalable exploration of the source posterior induced by new measurement operators without retraining the prior or modifying the learned FM dynamics. We benchmark representative methods from both families on a hierarchy of problems: 2D toy posteriors, chaotic Kuramoto-Sivashinsky trajectories, and wall-bounded turbulence reconstruction. Across these settings, we quantify trade-offs among measurement assimilation, posterior diversity, and physics/statistics fidelity, and establish D-Flow SGLD as a practical FM-compatible posterior sampler for scientific inverse problems.
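To make the idea of source-space posterior sampling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear velocity field `A` as a stand-in for a learned FM network, a fixed Euler discretization, a linear measurement operator `H`, Gaussian noise of scale `SIGMA`, and illustrative hyperparameters. The preconditioner is the common RMSprop-style diagonal form with the curvature correction term dropped, and the full gradient is used in place of a stochastic one. The transport stays fixed; only the source variable `z` is updated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a learned FM velocity field: v(x, t) = A @ x.
A = np.array([[0.0, 0.5], [0.2, 0.0]])
N_EULER = 10
DT = 1.0 / N_EULER
# Euler integration of a linear field collapses to one matrix: x_1 = M @ z.
M = np.linalg.matrix_power(np.eye(2) + DT * A, N_EULER)

H = np.array([[1.0, 0.0]])  # assumed operator: observe the first component
SIGMA = 0.5                 # assumed measurement noise standard deviation
y = np.array([1.0])         # assumed measurement


def transport(z):
    """Push source samples z of shape (n, 2) through the fixed flow."""
    return z @ M.T


def grad_log_posterior(z):
    """Gradient of log N(z; 0, I) + log N(y; H T(z), SIGMA^2 I) w.r.t. z.

    Analytic here because the toy flow is linear; a learned flow would
    need automatic differentiation through the ODE solver instead.
    """
    resid = y - transport(z) @ H.T            # shape (n, 1)
    return -z + (resid @ H @ M) / SIGMA**2    # shape (n, 2)


def psgld(n_chains=64, n_steps=5000, burn_in=1000,
          eps=0.05, alpha=0.99, lam=1e-5):
    """Preconditioned Langevin updates on the source variable z."""
    z = rng.standard_normal((n_chains, 2))    # initialize from the source prior
    v = np.ones_like(z)                       # running mean of squared gradients
    kept = []
    for k in range(n_steps):
        g = grad_log_posterior(z)
        v = alpha * v + (1.0 - alpha) * g**2
        precond = 1.0 / (lam + np.sqrt(v))    # diagonal preconditioner
        noise = rng.standard_normal(z.shape)
        z = z + 0.5 * eps * precond * g + np.sqrt(eps * precond) * noise
        if k >= burn_in:
            kept.append(z.copy())
    return np.concatenate(kept)


samples = psgld()

# Closed-form posterior mean of z for this linear-Gaussian toy problem.
b = (H @ M)[0]
mu = b * y[0] / (SIGMA**2 + b @ b)
print("sample mean of z:", samples.mean(axis=0))
print("analytic mean of z:", mu)
```

Because the toy problem is linear and Gaussian, the sample mean can be compared against the closed-form posterior mean. State-space posterior samples follow by applying `transport` to the retained source samples.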

Paper Structure

This paper contains 40 sections, 44 equations, 19 figures, 3 tables, and 1 algorithm.

Figures (19)

  • Figure 1: S-curve toy inverse problem. Conditional posterior samples ($N=1000$, orange) overlaid on unconditional OT-CFM samples (gray). For velocity-field guidance, the correction strength is $b = 3$.
  • Figure 2: Moon-curve toy inverse problem. Conditional posterior samples ($N=1000$, orange) overlaid on unconditional OT-CFM samples (gray). For velocity-field guidance, the correction strength is $b = 3$.
  • Figure 3: KS temporal forecasting with noise-free measurements. (top row) The ground truth measurement is shown in the first column, followed by a conditionally generated sample from each method. The region below the white line ($t>r$) is the forecasting portion. (middle row) Pointwise data assimilation error (MAE). (bottom row) The PDE residual, $R_j^n$, computed over the entire generated field to assess physical plausibility.
  • Figure 4: KS temporal forecasting with 10% noise corruption. (top row) The ground truth measurement is shown in the first column, followed by a conditionally generated sample from each method. The region below the white line ($t>r$) is the forecasting portion. (middle row) Pointwise data assimilation error (MAE). (bottom row) The PDE residual, $R_j^n$, computed over the entire generated field to assess physical plausibility.
  • Figure 5: Mean absolute PDE residual, $\overline{R}$, for the different guidance methods applied to the KS system. The results are shown for two inverse problems: temporal forecasting (a-b) and reconstruction from sparse sensors (c-d). Each problem is tested with both noise-free measurements (a, c) and measurements corrupted by $10\%$ noise (b, d). Bar heights represent the mean residual, and error bars denote the standard deviation across generated samples.
  • ...and 14 more figures
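
The captions for Figures 3 to 5 refer to a pointwise PDE residual $R_j^n$ and its mean absolute value $\overline{R}$. The paper's own discretization is not reproduced here. As a hedged illustration, the sketch below assumes the standard KS form $u_t + u u_x + u_{xx} + u_{xxxx} = 0$ on a periodic domain, spectral derivatives in space, and central differences in time. The function name `ks_residual` and the grid parameters are hypothetical.

```python
import numpy as np


def ks_residual(u, dx, dt):
    """Pointwise residual of u_t + u u_x + u_xx + u_xxxx on a space-time field.

    u has shape (n_t, n_x) and is assumed periodic in x. The result has
    shape (n_t - 2, n_x) because central differences in time skip the
    first and last time levels.
    """
    n_x = u.shape[1]
    k = 2.0 * np.pi * np.fft.fftfreq(n_x, d=dx)   # angular wavenumbers
    u_hat = np.fft.fft(u, axis=1)
    u_x = np.fft.ifft(1j * k * u_hat, axis=1).real
    u_xx = np.fft.ifft(-(k**2) * u_hat, axis=1).real
    u_xxxx = np.fft.ifft(k**4 * u_hat, axis=1).real
    u_t = (u[2:] - u[:-2]) / (2.0 * dt)
    return u_t + (u * u_x + u_xx + u_xxxx)[1:-1]


# Sanity check on a field that is steady in time: u(x, t) = sin(x).
# Here u_xx + u_xxxx = 0, so the residual reduces to u u_x = 0.5 sin(2x).
n_x = 32
x = np.linspace(0.0, 2.0 * np.pi, n_x, endpoint=False)
u = np.tile(np.sin(x), (5, 1))
R = ks_residual(u, dx=x[1] - x[0], dt=0.1)
R_bar = np.abs(R).mean()                          # mean absolute residual
print("max deviation from 0.5 sin(2x):", np.abs(R - 0.5 * np.sin(2.0 * x)).max())
```

A generated field that satisfies the assumed equation would give a residual near zero up to discretization error, so larger values of `R_bar` indicate departures from the governing physics.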