Convergence of Noise-Free Sampling Algorithms with Regularized Wasserstein Proximals
Fuqun Han, Stanley Osher, Wuchen Li
TL;DR
The paper develops and analyzes BRWP, a deterministic, semi-implicit method for sampling from strongly log-concave distributions. The method discretizes the probability flow ODE using a kernel derived from the regularized Wasserstein proximal operator. The paper proves second-order weak accuracy of the kernel, uniform local regularity of the BRWP iterates, and a per-step contraction in KL divergence, which together yield explicit step-size and mixing-time bounds. It also discusses practical score-approximation strategies and reports faster convergence and reduced bias compared to ULA and proximal Langevin in numerical experiments. Overall, BRWP provides a stable, efficient alternative to traditional Langevin-type samplers, with rigorous convergence guarantees under the stated regularity assumptions.
Abstract
In this work, we investigate the convergence properties of the backward regularized Wasserstein proximal (BRWP) method for sampling a target distribution. The BRWP approach can be interpreted as a semi-implicit time discretization of a probability flow ODE driven by the score function of a density that satisfies the Fokker-Planck equation of the overdamped Langevin dynamics. Specifically, the evolution of the density, and hence of the score function, is approximated via a kernel representation derived from the regularized Wasserstein proximal operator. By applying the dual formulation and a localized Taylor series to obtain the asymptotic expansion of this kernel formula, we establish guaranteed convergence of the BRWP method, in terms of the Kullback-Leibler divergence, towards a strongly log-concave target distribution. Our analysis also identifies the optimal and maximum step sizes for convergence. Furthermore, we demonstrate that the deterministic and semi-implicit BRWP scheme outperforms many classical Langevin Monte Carlo methods, such as the Unadjusted Langevin Algorithm (ULA), by offering faster convergence and reduced bias. Numerical experiments further validate the convergence analysis of the BRWP method.
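To make the semi-implicit probability flow idea concrete, the following Python sketch compares a deterministic particle update with ULA on a one-dimensional toy problem. It is only an illustration under strong assumptions, not the BRWP algorithm itself: the target is taken to be N(0, 1) (so V(x) = x^2/2 with unit inverse temperature), and the density at the next time level is obtained by a Gaussian moment closure of the Fokker-Planck equation instead of the regularized Wasserstein proximal kernel. The function names `gaussian_fp_step`, `semi_implicit_flow_step`, `ula_step`, and `run` are hypothetical.

```python
import numpy as np

def gaussian_fp_step(mean, var, h):
    """Propagate N(mean, var) over time h under the Fokker-Planck equation
    for V(x) = x^2 / 2 (Ornstein-Uhlenbeck). ASSUMPTION: this closed form
    stands in for the kernel-based density update of the paper."""
    return mean * np.exp(-h), 1.0 + (var - 1.0) * np.exp(-2.0 * h)

def semi_implicit_flow_step(x, h):
    """One deterministic step of dx/dt = -grad V(x) - grad log rho_t(x),
    with grad V explicit and the score taken from the density at t + h."""
    # Fit a Gaussian to the current particles and advance it by one step.
    m_next, s2_next = gaussian_fp_step(x.mean(), x.var(), h)
    score_next = -(x - m_next) / s2_next  # score of N(m_next, s2_next)
    grad_V = x                            # gradient of V(x) = x^2 / 2
    return x - h * (grad_V + score_next)

def ula_step(x, h, rng):
    """One step of the Unadjusted Langevin Algorithm for the same target."""
    return x - h * x + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)

def run(h=0.2, n_steps=200, n_particles=20000, seed=0):
    rng = np.random.default_rng(seed)
    # Both samplers start from the same particles drawn from N(3, 0.5^2).
    x_det = rng.normal(3.0, 0.5, size=n_particles)
    x_ula = x_det.copy()
    for _ in range(n_steps):
        x_det = semi_implicit_flow_step(x_det, h)
        x_ula = ula_step(x_ula, h, rng)
    return x_det, x_ula

x_det, x_ula = run()
print("deterministic flow: mean %.4f, var %.4f" % (x_det.mean(), x_det.var()))
print("ULA:                mean %.4f, var %.4f" % (x_ula.mean(), x_ula.var()))
```

For this quadratic potential, the ULA recursion has stationary variance 1/(1 - h/2), so its bias grows with the step size h. In this toy setting the deterministic update leaves the particles unchanged exactly when their empirical mean is zero and their empirical variance is one, so it carries no comparable step-size bias. This exactness relies on the Gaussian closure and should not be read as a statement about the general BRWP scheme.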
