Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion

Adrien Vacher, Omar Chehab, Anna Korba

TL;DR

This work tackles the challenge of drawing samples from unnormalized, multi-modal densities μ ∝ e^{-V} in fixed dimension. It introduces a reverse-diffusion-based sampler that reduces sampling to score estimation along the forward Ornstein–Uhlenbeck process, and it approximates the intermediate scores with a self-normalized Monte Carlo estimator that only queries V. Under semi-log-convexity and dissipativity assumptions, the authors prove a non-asymptotic guarantee: samples with small KL error are obtained with polynomial query complexity, without prior knowledge of the problem constants and without metastability. They further show that general Gaussian mixtures satisfy the required regularity, which yields polynomial bounds in the parameters governing multi-modality, and they report favorable empirical performance against standard baselines. The results connect diffusion-based sampling theory with concrete guarantees for low-dimensional multi-modal targets, with potential impact on Bayesian inference and probabilistic modeling, where multi-modal posteriors are common.
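
The score-estimation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the forward process is normalized as dX_t = -X_t dt + √2 dB_t, so that X_t = e^{-t} X_0 + Z with Z ~ N(0, (1 - e^{-2t}) I), and it uses the identity ∇ log p_t(y) = -E[Z | X_t = y] / (1 - e^{-2t}). The function name `snis_score` and its parameters are hypothetical; the paper's exact estimator and constants may differ.

```python
import numpy as np


def snis_score(V, y, t, n_samples=10_000, rng=None):
    """Self-normalized Monte Carlo estimate of grad log p_t(y).

    Assumption (not taken from the paper): the forward process is
    dX = -X dt + sqrt(2) dB, so X_t = exp(-t) * X_0 + Z with
    Z ~ N(0, (1 - exp(-2t)) I).
    V maps an array of shape (..., d) to potentials of shape (...).
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.atleast_1d(np.asarray(y, dtype=float))
    sigma2 = 1.0 - np.exp(-2.0 * t)

    # Draw the forward noise Z and recover the matching candidate for X_0.
    z = np.sqrt(sigma2) * rng.standard_normal((n_samples, y.size))
    x0 = np.exp(t) * (y - z)

    # Importance weights proportional to exp(-V(x0)); only V is queried.
    log_w = -V(x0)
    log_w = log_w - log_w.max()  # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                 # self-normalization

    # Weighted average of -Z / sigma2 approximates the score at y.
    return -(w[:, None] * z).sum(axis=0) / sigma2
```

For a standard Gaussian target, V(x) = |x|²/2, every p_t is again standard Gaussian, so the estimate should be close to -y; this gives a quick sanity check of the sketch.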

Abstract

Even in low dimensions, sampling from multi-modal distributions is challenging. We provide the first sampling algorithm for a broad class of distributions -- including all Gaussian mixtures -- with a query complexity that is polynomial in the parameters governing multi-modality, assuming fixed dimension. Our sampling algorithm simulates a time-reversed diffusion process, using a self-normalized Monte Carlo estimator of the intermediate score functions. Unlike previous works, it avoids metastability, requires no prior knowledge of the mode locations, and relaxes the well-known log-smoothness assumption, which has so far excluded general Gaussian mixtures.
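
To make the time-reversed simulation concrete, here is a hedged sketch of an Euler–Maruyama discretization of the reverse process, driven by a batched self-normalized score estimate. It assumes the forward normalization dX_t = -X_t dt + √2 dB_t, for which the reverse-time dynamics are dY_s = (Y_s + 2 ∇ log p_{T-s}(Y_s)) ds + √2 dB_s, started from N(0, I). All names (`batched_score`, `reverse_diffusion_sample`) and all default values (horizon `T`, step count, early-stopping time `t_min`, Monte Carlo budget `n_mc`) are illustrative assumptions, not the schedule analyzed in the paper.

```python
import numpy as np


def batched_score(V, y, t, n_mc, rng):
    """Self-normalized estimate of grad log p_t at each row of y, shape (B, d).

    V maps an array of shape (..., d) to potentials of shape (...).
    """
    sigma2 = 1.0 - np.exp(-2.0 * t)
    z = np.sqrt(sigma2) * rng.standard_normal((y.shape[0], n_mc, y.shape[1]))
    x0 = np.exp(t) * (y[:, None, :] - z)           # candidates for X_0
    log_w = -V(x0)                                  # shape (B, n_mc)
    log_w = log_w - log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)               # self-normalization
    return -(w[..., None] * z).sum(axis=1) / sigma2


def reverse_diffusion_sample(V, dim, n_particles=500, T=3.0, n_steps=150,
                             n_mc=500, t_min=1e-2, seed=0):
    """Euler-Maruyama simulation of the reverse process from time T to t_min.

    Assumed (hypothetical) schedule: uniform steps, early stopping at t_min.
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal((n_particles, dim))     # approximate law at time T
    times = np.linspace(T, t_min, n_steps + 1)
    for t_now, t_next in zip(times[:-1], times[1:]):
        h = t_now - t_next
        score = batched_score(V, y, t_now, n_mc, rng)
        drift = y + 2.0 * score                     # reverse-time drift
        y = y + h * drift + np.sqrt(2.0 * h) * rng.standard_normal(y.shape)
    return y
```

As a usage example, a two-mode target in one dimension can be specified through its unnormalized potential only, e.g. `V = lambda x: -np.logaddexp(-0.5 * (x[..., 0] + 3) ** 2, -0.5 * (x[..., 0] - 3) ** 2)`, and sampled with `reverse_diffusion_sample(V, dim=1)`. No mode locations are passed to the sampler; only evaluations of V are used.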

Paper Structure (41 sections, 18 theorems, 180 equations, 2 figures, 1 table)

Key Result

Theorem 1

[Main result, informal] Suppose that the semi-log-convexity and dissipativity assumptions hold. Then, for all $\epsilon>0$, there exists a stochastic algorithm whose parameters depend only on $\epsilon$ (and not on the parameters of the problem) that outputs a sample $X \sim \hat{p}$

Figures (2)

  • Figure 1: Error in Wasserstein distance as a function of the between-mode distance.
  • Figure 2: From left to right: our algorithm, ULA, and huang2024reverse. The color scheme indicates the probability density of the target distribution (dark is low density, bright is high density). The blue dots are the samples produced by each algorithm.

Theorems & Definitions (34)

  • Theorem 1
  • Proposition 2
  • Theorem 3: conforti2024klconvergenceguaranteesscore
  • Proposition 4: Non-asymptotic bound on the quadratic error
  • Remark 5
  • Proposition 6: Regularity bounds on the forward process
  • Lemma 7: Bounds on the moments of the ratio
  • Lemma 8
  • Theorem 9
  • Proposition 10
  • ...and 24 more