Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang

TL;DR

This work tackles the challenge of sampling from general, potentially non-log-concave targets $p_*(\mathbf{x}) \propto e^{-f_*(\mathbf{x})}$ without isoperimetric guarantees. It introduces RS-DMC, a diffusion-based Monte Carlo method built on Recursive Score Estimation (RSE) that partitions the forward OU diffusion into segments and solves a hierarchy of correlated mean-estimation and sampling subproblems, ensuring each intermediate target is strongly log-concave. The authors prove KL convergence with a quasi-polynomial gradient complexity $\exp\left[\mathcal{O}\left(L^3\cdot \log^3\left(\frac{Ld+M}{\epsilon}\right)\right)\right]$ and compare against ULA and RDS, highlighting improved efficiency and broader applicability. Empirical results on a multi-modal 2D target show RS-DMC achieves better mode coverage and faster convergence than standard Langevin-based methods, with RS-DMC-v2 balancing global exploration and local refinement.

Abstract

To sample from a general target distribution $p_*\propto e^{-f_*}$ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation. However, the original DMC algorithm has high gradient complexity, with an exponential dependency on the error tolerance $\epsilon$ of the obtained samples. In this paper, we demonstrate that the high complexity of DMC originates from its redundant design of score estimation, and propose a more efficient algorithm, called RS-DMC, based on a novel recursive score estimation method. In particular, we first divide the entire diffusion process into multiple segments and then formulate the score estimation step (at any time step) as a series of interconnected mean estimation and sampling subproblems, which are correlated in a recursive manner. Importantly, we show that with a proper design of the segment decomposition, every sampling subproblem only needs to tackle a strongly log-concave distribution, which can be solved very efficiently using Langevin-based samplers with a provably rapid convergence rate. As a result, we prove that the gradient complexity of RS-DMC has only a quasi-polynomial dependency on $\epsilon$, which significantly improves on the exponential gradient complexity in Huang et al. (2023). Furthermore, under commonly used dissipative conditions, our algorithm is provably much faster than the popular Langevin-based algorithms. Our algorithm design and theoretical framework illuminate a novel direction for addressing sampling problems, which could be of broader applicability in the community.
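To see why a short enough segment makes every sampling subproblem strongly log-concave, here is a brief worked derivation. It is a sketch under two assumptions the abstract does not spell out: the forward process is the standard OU process, so that within a segment of length $S$ the subproblem target has the form $q(\mathbf{x}_0|\mathbf{x}) \propto p_{k,0}(\mathbf{x}_0)\exp\left(-\|\mathbf{x}-e^{-(S-t)}\mathbf{x}_0\|^2 / \left(2(1-e^{-2(S-t)})\right)\right)$, and $-\log p_{k,0}$ is $L$-smooth.

$$
-\nabla^2_{\mathbf{x}_0}\log q(\mathbf{x}_0|\mathbf{x})
= -\nabla^2\log p_{k,0}(\mathbf{x}_0) + \frac{e^{-2(S-t)}}{1-e^{-2(S-t)}}\,\mathbf{I}
\succeq \left(\frac{e^{-2S}}{1-e^{-2S}} - L\right)\mathbf{I}.
$$

The inequality uses that $u \mapsto e^{-2u}/(1-e^{-2u})$ is decreasing and that $S-t\le S$. Choosing, for example, $S=\frac{1}{2}\log\frac{2L+1}{2L}$ gives $e^{-2S}/(1-e^{-2S}) = 2L$, so under these assumptions the subproblem target is $L$-strongly log-concave even when $p_{k,0}$ itself is far from log-concave.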

Paper Structure (46 sections, 34 theorems, 265 equations, 3 figures, 2 tables, 2 algorithms)

Key Result

Lemma 2.1

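For any $k\in \mathbb{N}_{0,K-1}$ and $t\in[0,S]$, the score function can be written as

$$\nabla_{{\bm{x}}} \log p_{k,S-t}({\bm{x}}) = \mathbb{E}_{{\bm{x}}_0 \sim q_{k,S-t}(\cdot | {\bm{x}})}\left[-\frac{{\bm{x}} - e^{-(S-t)}{\bm{x}}_0}{1-e^{-2(S-t)}}\right],$$

where the conditional density function $q_{k,S-t}(\cdot | {\bm{x}})$ is defined as

$$q_{k,S-t}({\bm{x}}_0 | {\bm{x}}) \propto \exp\left(\log p_{k,0}({\bm{x}}_0) - \frac{\left\|{\bm{x}} - e^{-(S-t)}{\bm{x}}_0\right\|^2}{2\left(1-e^{-2(S-t)}\right)}\right).$$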
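Lemma 2.1 turns score estimation into a mean-estimation problem over samples from a conditional density. The sketch below is a minimal numerical check of that idea, not the authors' implementation. It assumes the standard OU forward process, for which the score at time $t$ equals the average of $-({\bm{x}} - e^{-t}{\bm{x}}_0)/(1-e^{-2t})$ over ${\bm{x}}_0$ drawn from a density proportional to $p_0({\bm{x}}_0)\exp\left(-\|{\bm{x}} - e^{-t}{\bm{x}}_0\|^2/(2(1-e^{-2t}))\right)$. The function name, the Gaussian test target, and all step-size and chain-count values are hypothetical choices made for illustration.

```python
import numpy as np

def score_via_posterior_mean(x, t, grad_f0, n_chains=4000, n_steps=500,
                             step=0.05, seed=0):
    """Estimate grad log p_t(x) by sampling the conditional density with ULA.

    grad_f0 is the gradient of f0 = -log p_0 (up to a constant), applied row-wise.
    """
    rng = np.random.default_rng(seed)
    a = np.exp(-t)                # OU contraction factor e^{-t}
    var = 1.0 - np.exp(-2.0 * t)  # OU noise variance 1 - e^{-2t}
    # Independent ULA chains, all started at the origin (an arbitrary choice).
    x0 = np.zeros((n_chains, x.shape[0]))
    for _ in range(n_steps):
        # Gradient of the potential f0(x0) + ||x - a x0||^2 / (2 var).
        grad = grad_f0(x0) - a * (x - a * x0) / var
        x0 = x0 - step * grad + np.sqrt(2.0 * step) * rng.standard_normal(x0.shape)
    # Mean estimation: plug the sample mean of x0 into the score formula.
    return -(x - a * x0.mean(axis=0)) / var

# Hypothetical test target: p_0 = N(mu, sigma2 * I), whose OU marginal is Gaussian.
mu = np.array([1.0, -2.0])
sigma2 = 0.25
grad_f0 = lambda z: (z - mu) / sigma2

x = np.array([0.5, 0.3])
t = 0.2
estimate = score_via_posterior_mean(x, t, grad_f0)
# Closed-form score of p_t = N(e^{-t} mu, (e^{-2t} sigma2 + 1 - e^{-2t}) I).
exact = -(x - np.exp(-t) * mu) / (np.exp(-2 * t) * sigma2 + 1 - np.exp(-2 * t))
print("estimate:", estimate, "exact:", exact)
```

A Gaussian target is used only because its OU marginal, and hence the exact score, is available in closed form for comparison. In RS-DMC the gradient of $\log p_{k,0}$ is not given directly but is itself estimated recursively from the previous segment.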

Figures (3)

  • Figure 1: Illustration of the forward SDE and the reverse SDE, covering the definitions in the preliminaries section. The top of the figure describes the underlying distributions of the segmented OU process (the forward SDE), and the bottom presents the corresponding distributions along the reverse SDE. For the intermediate part, the upper half describes the gradients of the log densities along the forward SDE, while the lower half describes the approximated scores used to update particles in the reverse SDE.
  • Figure 2: Illustration of recursive score estimation (RSE). The upper half presents RSE from a local view, which shows how to use the former score, e.g., $\nabla\log p_{k,0}({\bm{x}}^\prime)$, to update particles by ULA in the sampling subproblem formulated by the latter score, e.g., $\nabla\log p_{k,S-t}({\bm{x}})$. The lower half presents RSE from a global view, as a series of interconnected mean estimation and sampling subproblems.
  • Figure 3: Illustration of the particles returned by ULA, RS-DMC-v1, and RS-DMC-v2 (orange), together with samples from the ground truth (blue). The first row shows ULA, the second RS-DMC-v1, and the last RS-DMC-v2. ULA converges quickly within the local region of a mode but struggles to cover all modes. RS-DMC-v1 covers most modes with few gradient oracle calls but converges slowly in local regions. RS-DMC-v2 combines the advantages of ULA and RS-DMC-v1: it covers most modes and converges faster locally.
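For context on the ULA baseline in Figure 3, the following is a minimal sketch of the unadjusted Langevin algorithm on a multi-modal 2D target. The four-component Gaussian mixture, the step size, and the particle count are hypothetical and are not the paper's experimental setup. The last two lines count how many particles end up nearest to each mode, which is one simple way to inspect mode coverage.

```python
import numpy as np

def mixture_score(x, means, s2):
    """Score of an equal-weight Gaussian mixture with covariance s2 * I.

    x has shape (n, d) and means has shape (K, d).
    """
    diff = means[None, :, :] - x[:, None, :]          # (n, K, d)
    logw = -np.sum(diff ** 2, axis=2) / (2.0 * s2)    # unnormalized log responsibilities
    logw -= logw.max(axis=1, keepdims=True)           # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", w, diff) / s2

def ula(score, x_init, step, n_steps, rng):
    """Unadjusted Langevin algorithm: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    x = x_init.copy()
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
means = np.array([[-4.0, -4.0], [-4.0, 4.0], [4.0, -4.0], [4.0, 4.0]])  # hypothetical modes
s2 = 0.25
x_init = rng.standard_normal((1000, 2))
particles = ula(lambda x: mixture_score(x, means, s2), x_init,
                step=0.01, n_steps=500, rng=rng)

# Assign each particle to its nearest mode and count the particles per mode.
nearest = np.argmin(((particles[:, None, :] - means[None, :, :]) ** 2).sum(axis=2), axis=1)
mode_counts = np.bincount(nearest, minlength=len(means))
print(mode_counts)
```

Moving the initialization close to a single mode, or pulling the modes further apart, is an easy way to observe the coverage problem the caption describes.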

Theorems & Definitions (58)

  • Lemma 2.1: Lemma 1 of Huang et al. (2023)
  • Theorem 4.1: Gradient complexity of RS-DMC, informal
  • Definition 1: Logarithmic Sobolev inequality
  • Definition 2: Poincaré inequality
  • Theorem B.1
  • proof: Proof of the formal version of the main theorem
  • Corollary B.2
  • proof
  • Lemma C.1: Lemma 11 in Vempala and Wibisono (2019)
  • Lemma C.2
  • ...and 48 more
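
For readers unfamiliar with the two isoperimetric conditions named in Definitions 1 and 2, their standard forms are sketched below for a distribution $p$ and any sufficiently smooth test function $g$. The constant names $C_{\mathrm{LSI}}$ and $C_{\mathrm{PI}}$ and the normalization are assumptions made here; the paper may use a different but equivalent scaling.

$$
\text{(LSI)}\quad \mathbb{E}_{p}\left[g^2\log g^2\right] - \mathbb{E}_{p}\left[g^2\right]\log \mathbb{E}_{p}\left[g^2\right] \le 2\,C_{\mathrm{LSI}}\,\mathbb{E}_{p}\left[\|\nabla g\|^2\right],
$$

$$
\text{(PI)}\quad \mathrm{Var}_{p}(g) \le C_{\mathrm{PI}}\,\mathbb{E}_{p}\left[\|\nabla g\|^2\right].
$$

"Sampling without isoperimetry" refers to targets for which no such inequality is assumed, or for which the constants may be very large.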