Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces

Zilan Cheng, Li-Lian Wang, Zhongjian Wang

Abstract

We propose a machine-learning algorithm for Bayesian inverse problems in the function-space regime based on one-step generative transport. Building on Mean Flows, we learn a fully conditional amortized sampler with a neural-operator backbone that maps reference Gaussian noise to approximate posterior samples. We show that while white-noise references may be admissible at a fixed discretization, they become incompatible with the function-space limit, leading to instability in inference for Bayesian problems arising from PDEs. To address this issue, we adopt a prior-aligned anisotropic Gaussian reference distribution and establish the Lipschitz regularity of the resulting transport. Our method is not distilled from MCMC: training relies only on prior samples and simulated partial, noisy observations. Once trained, it generates a $64\times64$ posterior sample in $\sim 10^{-3}$ s, avoiding the repeated PDE solves of MCMC while matching key posterior summaries.

Paper Structure (58 sections, 6 theorems, 89 equations, 6 figures, 9 tables, 6 algorithms)


Key Result

Theorem 3.2

Assume the Gaussian-tail condition in Assumption \ref{assu:gaussian_tail_function_space}. Then the averaged velocity $w(\cdot;0,1)$ is globally Lipschitz on $\mathcal{H}$, and the one-step map is globally Lipschitz with $\mathrm{Lip}(\mathcal{T}(\cdot;y_{\mathrm{obs}}))\le 1+\tfrac{L_0}{2}$, where $L_0$ depends only on the Gaussian-tail parameters.
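
A brief sketch of how a bound of this shape can arise. It assumes that the one-step map is the identity plus the averaged velocity, $\mathcal{T}(\xi;y_{\mathrm{obs}})=\xi+w(\xi;0,1)$ (up to a sign convention), and that $\mathrm{Lip}(w(\cdot;0,1))\le L_0/2$. Both are assumptions inferred from the form of the stated constant, not statements quoted from the paper's proof.

```latex
\begin{align*}
\bigl\|\mathcal{T}(\xi_1;y_{\mathrm{obs}})-\mathcal{T}(\xi_2;y_{\mathrm{obs}})\bigr\|_{\mathcal{H}}
  &\le \|\xi_1-\xi_2\|_{\mathcal{H}}
     + \bigl\|w(\xi_1;0,1)-w(\xi_2;0,1)\bigr\|_{\mathcal{H}}
     && \text{(triangle inequality)} \\
  &\le \|\xi_1-\xi_2\|_{\mathcal{H}}
     + \tfrac{L_0}{2}\,\|\xi_1-\xi_2\|_{\mathcal{H}}
     && \text{(assumed Lipschitz bound on } w\text{)} \\
  &= \Bigl(1+\tfrac{L_0}{2}\Bigr)\|\xi_1-\xi_2\|_{\mathcal{H}}.
\end{align*}
```

Under these assumptions the constant involves only $L_0$ and no discretization size, which is what makes the estimate meaningful in the function-space limit.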

Figures (6)

  • Figure 1: Energy spectra $E(k)$ for the identity inverse problem. Top: Encode-only training (loss in encoded space). Bottom: Encode-Decode training (loss in physical space). For reference, we also plot the ground-truth closed-form posterior $\pi_{\mathrm{exact}}$ (Appendix \ref{app:identity_posterior}) and the prior spectrum. The curves labeled $(\mathcal{T}_\theta)_\#\rho_A$ and $(\mathcal{T}_\theta)_\#\rho_W$ denote model-approximated posteriors obtained by transporting the anisotropic reference $\rho_A=\gamma$ and the white-noise reference $\rho_W$, respectively, where $\gamma$ is the trace-class prior.
  • Figure 2: Conditional neural operator backbone for one-step posterior sampling. A latent/reference draw $\xi\sim\rho$ (with $\rho=\gamma$ in our main prior-aligned trace-class setting) is mapped by a lifting operator $\mathcal{P}$, $L$ conditional operator layers, and a projection operator $\mathcal{Q}$. During training we sample time pairs $0\le r\le t\le 1$; at inference we evaluate the learned map at $(r,t)=(0,1)$ to obtain a one-step transport that generates approximate posterior samples.
  • Figure 3: Darcy Equation. 3$\times$3 visualization: (Top) ground truth and observations, (Bottom) posterior means. Reference: $\rho_A=\mathcal{N}\bigl(0,(-\Delta_\mathrm{Neumann}+9I)^{-2}\bigr)$. Noise level: $\sigma_\eta=1$.
  • Figure 4: Advection Equation. 3$\times$3 visualization: (Top) ground truth and observations, (Bottom) posterior means. Reference: $\rho_A=\mathcal{N}\bigl(0,(-\Delta_\mathrm{per}+9I)^{-2}\bigr)$. Noise level: $\sigma_\eta=0.05$.
  • Figure 5: Reaction-Diffusion Equation. 3$\times$3 visualization: (Top) ground truth and observations, (Bottom) posterior means. Reference: $\rho_A=\mathcal{N}\bigl(0,(-\Delta_\mathrm{per}+9I)^{-2}\bigr)$. Noise level: $\sigma_\eta=0.05$.
  • ...and 1 more figure
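
The periodic reference $\rho_A=\mathcal{N}\bigl(0,(-\Delta_\mathrm{per}+9I)^{-2}\bigr)$ named in the captions of Figures 4 and 5 can be sampled spectrally. The following is a minimal NumPy sketch, not the authors' implementation. It assumes the unit torus $[0,1)^2$ (so $-\Delta_\mathrm{per}$ has eigenvalues $4\pi^2|k|^2$) and an orthonormal Fourier basis; the function names `reference_eigenvalues` and `sample_reference` are hypothetical. The Neumann case of Figure 3 would need a cosine basis and is not covered.

```python
import numpy as np

def reference_eigenvalues(n, shift=9.0, power=2.0):
    """Eigenvalues of (-Laplacian_per + shift*I)^(-power) on an n x n Fourier grid.

    Assumption: unit torus [0, 1)^2, so -Laplacian has eigenvalues 4*pi^2*|k|^2.
    """
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers 0, 1, ..., -1
    kx, ky = np.meshgrid(k, k, indexing="ij")
    return (4.0 * np.pi**2 * (kx**2 + ky**2) + shift) ** (-power)

def sample_reference(n, rng, shift=9.0, power=2.0):
    """Draw one sample of N(0, (-Laplacian_per + shift*I)^(-power)) on the grid."""
    lam = reference_eigenvalues(n, shift, power)
    white = rng.standard_normal((n, n))  # i.i.d. N(0, 1) grid noise
    coloured_hat = np.sqrt(lam) * np.fft.fft2(white)
    # The factor n makes the pointwise variance equal sum(lam), matching an
    # orthonormal Fourier basis on the unit torus (a normalization assumption).
    return n * np.fft.ifft2(coloured_hat).real

rng = np.random.default_rng(0)
xi = sample_reference(64, rng)
print(xi.shape)
```

Because the spectral weights decay like $|k|^{-4}$, draws are smooth fields whose statistics do not depend on the grid size, in contrast to grid white noise.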

Theorems & Definitions (16)

  • Remark 3.1: Normalizer
  • Theorem 3.2: Informal
  • Proposition 3.3: White-noise and trace-class Gaussians are mutually singular
  • Remark 4.1: Fair comparison and KL dimension
  • Remark A.1: Choice of KL truncation
  • Remark A.2: Variational interpretation and MAP
  • Remark A.3: Chain length and stability criterion
  • Theorem C.2: Dimension-free Lipschitz Föllmer velocity under Gaussian tails \cite{meng2025pathway}
  • Remark C.3: Function-space interpretation
  • Theorem C.4: Averaged velocity is Lipschitz
  • ...and 6 more
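
The contrast behind Proposition 3.3 can be illustrated numerically. For a centered Gaussian, the expected squared $L^2$ norm of a draw equals the trace of its covariance. The sketch below compares that trace for a covariance of the form $(-\Delta_\mathrm{per}+9I)^{-2}$ and for the identity (white noise) as the grid is refined. It is an illustration of the scaling only, not a proof of mutual singularity; the unit-torus domain and the helper names are assumptions.

```python
import numpy as np

def trace_class_trace(n, shift=9.0, power=2.0):
    """Trace of (-Laplacian_per + shift*I)^(-power) restricted to an n x n Fourier grid.

    Assumption: unit torus [0, 1)^2, so -Laplacian has eigenvalues 4*pi^2*|k|^2.
    """
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    lam = (4.0 * np.pi**2 * (kx**2 + ky**2) + shift) ** (-power)
    return float(lam.sum())

def white_noise_trace(n):
    """Trace of the identity covariance: unit variance on each of the n*n modes."""
    return float(n * n)

resolutions = [16, 32, 64, 128]
traces_A = [trace_class_trace(n) for n in resolutions]
traces_W = [white_noise_trace(n) for n in resolutions]

for n, a, w in zip(resolutions, traces_A, traces_W):
    print(f"n={n:4d}  trace-class: {a:.6f}  white noise: {w:.0f}")
```

The trace-class column settles to a finite limit, so draws stay in $L^2$ as the mesh is refined, while the white-noise column grows like $n^2$, so the white-noise reference has no finite-energy limit in the same space.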