Table of Contents
Fetching ...

Conditional Flow Matching for Bayesian Posterior Inference

So Won Jeong, Percy S. Zhai, Veronika Ročková

TL;DR

This work develops a likelihood-free Bayesian posterior sampler by framing posterior inference as learning a dynamic block-triangular transport on the joint space $\mathcal{Y} \times \Theta$ via conditional flow matching. A velocity field $v_t$ drives a continuous-time flow that transports a base joint distribution to the target posterior, with monotonicity constraints yielding a conditional Brenier map and Monge–Kantorovich depth-based credible sets, plus accessible inverse maps for posterior ranks. The authors provide rigorous guarantees: a block-triangular mapping property, $W_2$-consistency of the learned posterior, and convergence of MK-depth-based credible sets, all while enabling one-time training with amortized inference. Empirically, the method shows flexibility to complex geometries, scalable performance with data dimension, and competitive results compared to GAN- and diffusion-based approaches, highlighting its practical potential for likelihood-free posterior inference and uncertainty quantification.

Abstract

We propose a generative multivariate posterior sampler via flow matching. It offers a simple training objective, and does not require access to likelihood evaluation. The method learns a dynamic, block-triangular velocity field in the joint space of data and parameters, which results in a deterministic transport map from a source distribution to the desired posterior. The inverse map, named vector rank, is accessible by reversibly integrating the velocity over time. It is advantageous to leverage the dynamic design: proper constraints on the velocity yield a monotone map, which leads to a conditional Brenier map, enabling a fast and simultaneous generation of Bayesian credible sets whose contours correspond to level sets of Monge-Kantorovich data depth. Our approach is computationally lighter compared to GAN-based and diffusion-based counterparts, and is capable of capturing complex posterior structures. Finally, frequentist theoretical guarantee on the consistency of the recovered posterior distribution, and of the corresponding Bayesian credible sets, is provided.

Conditional Flow Matching for Bayesian Posterior Inference

TL;DR

This work develops a likelihood-free Bayesian posterior sampler by framing posterior inference as learning a dynamic block-triangular transport on the joint space via conditional flow matching. A velocity field drives a continuous-time flow that transports a base joint distribution to the target posterior, with monotonicity constraints yielding a conditional Brenier map and Monge–Kantorovich depth-based credible sets, plus accessible inverse maps for posterior ranks. The authors provide rigorous guarantees: a block-triangular mapping property, -consistency of the learned posterior, and convergence of MK-depth-based credible sets, all while enabling one-time training with amortized inference. Empirically, the method shows flexibility to complex geometries, scalable performance with data dimension, and competitive results compared to GAN- and diffusion-based approaches, highlighting its practical potential for likelihood-free posterior inference and uncertainty quantification.

Abstract

We propose a generative multivariate posterior sampler via flow matching. It offers a simple training objective, and does not require access to likelihood evaluation. The method learns a dynamic, block-triangular velocity field in the joint space of data and parameters, which results in a deterministic transport map from a source distribution to the desired posterior. The inverse map, named vector rank, is accessible by reversibly integrating the velocity over time. It is advantageous to leverage the dynamic design: proper constraints on the velocity yield a monotone map, which leads to a conditional Brenier map, enabling a fast and simultaneous generation of Bayesian credible sets whose contours correspond to level sets of Monge-Kantorovich data depth. Our approach is computationally lighter compared to GAN-based and diffusion-based counterparts, and is capable of capturing complex posterior structures. Finally, frequentist theoretical guarantee on the consistency of the recovered posterior distribution, and of the corresponding Bayesian credible sets, is provided.

Paper Structure

This paper contains 42 sections, 6 theorems, 76 equations, 11 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

The dynamic map $T_t$ is block-triangular in the form of eq:dynamic.triangular.map for all $t\in [0,1]$ if the velocity field can be expressed in the block-triangular form of eq:triangular.velocity.

Figures (11)

  • Figure 1: The traditional MCMC methods fail to search the entire region. The sampling time for each method is noted in the respective plot title. Our method requires one-time training (7.5 s); per sampling cost is nearly instantaneous and the model captures the underlying geometry.
  • Figure 2: Posterior estimation for Gaussian conjugate model. Our model scales better than M-GAN with the increasing number of observations ($n \geq 64$).
  • Figure 3: The panel (a) shows the credible sets with varying $\tau$ level ($\tau \in \{0.5, 0.6, 0.7, 0.8, 0.9, 0.95\}$) for gaussian conjugate model with dimension $n=8$. With monotonicity, we obtain non-crossing nested credible sets. In panel (b), each boxplot shows the distribution of empirical posterior mass contained in the estimated $\tau$–level credible set across 100 simulations. The red dashed line denotes the ideal $y=x$ calibration.
  • Figure 4: Posterior recovery of Neal's funnel (\ref{['eq:neals.funnel']}) at different observation $x = [-5, 1, 10, 25]$, compared with standard Metropolis-Hastings (MH) sampler (first row). Overall, the learned posterior by flow (second row) matches with MH estimations across the practical range of $x$.
  • Figure 5: The prior for the infection rate $\beta$ is much more dispersed than the prior for the recovery rate $\gamma$: the sample variance of $\beta$ is 0.053 whereas $\gamma$ is only 0.0007 (ratio $\approx 79 : 1$). Among the seven handcrafted summary statistics, dispersion differs by two orders of magnitude: the variance of the epidemic peak size (peak I) is 166, while the growth-slope and correlation summaries are all below 0.05. Point-wise trajectory variances $\text{Var}\bigl(S_t,I_t,R_t\bigr)$ are shown on a $\log(1+\mathrm{var})$ scale. The susceptible (S) and recovered (R) compartments start around 0.11 and decay steadily, whereas the infectious component (I) remains almost deterministic throughout, never exceeding 0.003 (below the first colour step). Together, the three panels illustrate that the statistical scales relevant to learning the joint flow range from $10^{-3}$ (I-variance) to $10^{2}$ (peak I variance).
  • ...and 6 more figures

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 1: Posterior Sampler Accuracy
  • Theorem 2: Consistency of Posterior
  • Corollary 3: Consistency of Credible Sets
  • Proposition 4: MK Conditional Rank
  • Lemma 2: Instantaneous Change of Variables chen2018neural