An Optimal Transport-Based Generative Model for Bayesian Posterior Sampling

Ke Li, Wei Han, Yuexi Wang, Yun Yang

TL;DR

A new generative modeling approach based on optimal transport that learns a deterministic map from a reference distribution to the target posterior through constrained optimization and allows efficient generation of many independent, high-quality posterior samples in Bayesian inference.

Abstract

We investigate the problem of sampling from posterior distributions with intractable normalizing constants in Bayesian inference. Our solution is a new generative modeling approach based on optimal transport (OT) that learns a deterministic map from a reference distribution to the target posterior through constrained optimization. The method uses structural constraints from OT theory to ensure uniqueness of the solution and allows efficient generation of many independent, high-quality posterior samples. The framework supports both continuous and mixed discrete-continuous parameter spaces, with specific adaptations for latent variable models and near-Gaussian posteriors. Beyond computational benefits, it also enables new inferential tools based on OT-derived multivariate ranks and quantiles for Bayesian exploratory analysis and visualization. We demonstrate the effectiveness of our approach through multiple simulation studies and a real-world data analysis.

Paper Structure

This paper contains 50 sections, 9 theorems, 84 equations, 14 figures, 6 tables.

Key Result

Lemma 1

Assume that each element $T\in\mathcal{T}$ is differentiable and invertible on the support of $\mu$. Then the optimization problem in eqn:KL_Form is equivalent to a problem stated in terms of the Jacobian determinant $\det(J_T)$, where $J_T\in\mathbb R^{p\times p}$ denotes the Jacobian matrix of the map $T$ from $\mathbb R^p$ to $\mathbb R^p$ and $\det (A)$ denotes the determinant of a square matrix $A$.
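
As a rough illustration of the kind of objective Lemma 1 points to, the sketch below assumes the equivalent problem takes the standard change-of-variables form $\min_{T\in\mathcal T}\mathbb E_{X\sim\mu}\big[-\log\tilde\pi(T(X))-\log|\det (J_T(X))|\big]$, with $\tilde\pi$ an unnormalized target density. This form, the symbol $\tilde\pi$, the Gaussian stand-in target, the restriction to affine maps, and all function names are assumptions made for illustration; none of them is taken from the paper. The code estimates the objective by Monte Carlo and compares the identity map with the map $x\mapsto m+\Sigma^{1/2}x$, which is the known OT map from $\mathcal N(0,I_p)$ to $\mathcal N(m,\Sigma)$ under quadratic cost.

```python
import numpy as np


def log_target_unnorm(theta, mean, cov_inv):
    """Unnormalized log-density of a Gaussian stand-in target (hypothetical)."""
    diff = theta - mean
    return -0.5 * np.einsum("ni,ij,nj->n", diff, cov_inv, diff)


def affine_objective(A, b, x, mean, cov_inv):
    """Monte Carlo estimate of E_mu[-log pi_tilde(T(X)) - log|det J_T(X)|]
    for the affine map T(x) = A x + b, whose Jacobian is the constant matrix A."""
    theta = x @ A.T + b
    _, logabsdet = np.linalg.slogdet(A)  # log|det A|, numerically stable
    return np.mean(-log_target_unnorm(theta, mean, cov_inv)) - logabsdet


rng = np.random.default_rng(0)
p = 2
x = rng.standard_normal((20000, p))  # draws from the reference N(0, I_p)

# Assumed target parameters, chosen only for this example.
mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])
cov_inv = np.linalg.inv(cov)

# Symmetric square root of cov: linear part of the OT map for a Gaussian target.
eigval, eigvec = np.linalg.eigh(cov)
A_opt = eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T

obj_identity = affine_objective(np.eye(p), np.zeros(p), x, mean, cov_inv)
obj_opt = affine_objective(A_opt, mean, x, mean, cov_inv)
print(f"identity map: {obj_identity:.3f}, OT affine map: {obj_opt:.3f}")
```

In this toy setting the OT affine map should attain the smaller objective value. Because the normalizing constant of the target only shifts the objective by a constant, it never needs to be computed.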

Figures (14)

  • Figure 1: Center-outward quantile contours for the two-component bivariate Gaussian mixture example. The quantile contours are obtained by passing the reference contours through the transport maps estimated by our method (with $L=2$), Planar, ICNN, and the triangular map.
  • Figure 2: Comparison of draws generated from the target distribution and different approximate methods when $d=20$ and $K=10$.
  • Figure 3: Difference ratio of $95\%$ credible intervals for parameters and $95\%$ prediction intervals of new data points, and the standardized 2-Wasserstein distance for posterior distributions under transport map methods compared with the Gibbs posterior.
  • Figure 4: Posterior summary for $(\beta_9,\beta_{10})$ and simultaneous credible intervals for all $\beta_j$'s. Here we include: (a) marginal densities of $\beta_9$ and $\beta_{10}$ produced by the Gibbs posterior and our method (OT); (b) the 95% credible region by OT (projected onto $(\beta_9, \beta_{10})$); and (c) all 95% simultaneous credible intervals (CIs).
  • Figure 5: Center-outward quantile contours for the Banana distribution. We show the contours of the reference distribution $\mathcal{N}(0, I_2)$ together with the contours obtained by passing them through the transport maps estimated by our method (with $L=1$), Planar, ICNN with softplus activation, the triangular map, and Neural Spline Flows (NSF). The axes of the reference distribution are also drawn to show directional information.
  • ...and 9 more figures
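
The captions for Figures 1 and 5 describe quantile contours obtained by passing reference contours through a transport map. A minimal sketch of that step follows, assuming the level-$\tau$ reference contour is the circle that encloses probability $\tau$ under $\mathcal{N}(0, I_2)$. The map `banana_map` is a hypothetical stand-in for a fitted map (it is a simple triangular map, not an OT map and not the paper's estimate), and all names and parameter values are illustrative.

```python
import numpy as np


def reference_contour(tau, n_points=200):
    """Points on the circle of radius r with P(||Z|| <= r) = tau for Z ~ N(0, I_2).

    ||Z||^2 is chi-square with 2 degrees of freedom, so r^2 = -2 log(1 - tau).
    """
    radius = np.sqrt(-2.0 * np.log(1.0 - tau))
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return radius * np.column_stack([np.cos(angles), np.sin(angles)])


def banana_map(z, curvature=0.5):
    """Hypothetical stand-in for a fitted transport map (assumption, not an OT map)."""
    z1, z2 = z[:, 0], z[:, 1]
    return np.column_stack([z1, z2 + curvature * (z1 ** 2 - 1.0)])


# Push each reference contour through the map to get a contour in target space.
levels = (0.25, 0.5, 0.75)
contours = {tau: banana_map(reference_contour(tau)) for tau in levels}

for tau, pts in contours.items():
    print(f"tau={tau}: {pts.shape[0]} contour points in target space")
```

Since the map is a bijection, a pushed-forward point lies inside the image contour exactly when its preimage lies inside the reference circle, so each image contour encloses the same probability mass $\tau$ under the pushed-forward distribution. Plotting `contours[tau]` for several levels would give a picture of the same kind as the figures, with the estimated map substituted for `banana_map`.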

Theorems & Definitions (13)

  • Lemma 1
  • Lemma 2: OT map existence and regularity
  • Remark 1: Multivariate extension of monotonicity
  • Remark 2: OT map is the most parsimonious map
  • Lemma 3: Target distribution with disconnected support (from kitagawa2019free)
  • Lemma 4
  • Remark 3: Embedding of discrete components
  • Theorem 1
  • Lemma 5: Optimization objective for mixed parameters
  • Lemma 6: Affine transport maps for nearly Gaussian posteriors
  • ...and 3 more