Accelerating Langevin Sampling with Birth-death
Yulong Lu, Jianfeng Lu, James Nolen
TL;DR
The paper addresses sampling from multimodal distributions by adding a birth-death mechanism to Langevin diffusion. The mean-field dynamics are a Fokker-Planck equation with a nonlocal birth-death term, which can be viewed as a gradient flow of the KL divergence with respect to the Wasserstein-Fisher-Rao metric. The authors prove that, under some assumptions, the asymptotic convergence rate is independent of the potential barrier, whereas for Langevin diffusion it depends exponentially on the barrier. They illustrate the approach with analytical examples, a practical interacting-particle algorithm (BDLS), and numerical experiments on a torus, Gaussian mixtures, and Bayesian GMMs. The birth-death term provides a scalable mechanism that moves mass globally, which improves mixing across modes, and the work provides a foundation for combining birth-death dynamics with other sampling schemes.
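For orientation, the mean-field equation summarized above can be written out as follows. This is a sketch under assumed notation, which may differ from the body of the paper: $\rho_t$ is the density at time $t$, $V$ is the potential, and $\pi \propto e^{-V}$ is the target.

```latex
\[
\partial_t \rho_t
  = \underbrace{\nabla \cdot \bigl( \nabla \rho_t + \rho_t \nabla V \bigr)}_{\text{Fokker-Planck (Langevin) part}}
  \;-\;
  \underbrace{\rho_t \left( \log \frac{\rho_t}{\pi}
      - \int \rho_t \log \frac{\rho_t}{\pi} \, dx \right)}_{\text{nonlocal birth-death part}},
\qquad \pi \propto e^{-V}.
\]
```

The integral is the KL divergence $\mathrm{KL}(\rho_t \,\|\, \pi)$. Subtracting it makes the birth-death term integrate to zero, so total mass is conserved. Mass is removed where $\rho_t$ exceeds $\pi$ relative to this average and created where it falls short, without having to be transported across a barrier.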
Abstract
A fundamental problem in Bayesian inference and statistical machine learning is to efficiently sample from multimodal distributions. Due to metastability, multimodal distributions are difficult to sample using standard Markov chain Monte Carlo methods. We propose a new sampling algorithm based on a birth-death mechanism to accelerate the mixing of Langevin diffusion. Our algorithm is motivated by its mean field partial differential equation (PDE), which is a Fokker-Planck equation supplemented by a nonlocal birth-death term. This PDE can be viewed as a gradient flow of the Kullback-Leibler divergence with respect to the Wasserstein-Fisher-Rao metric. We prove that under some assumptions the asymptotic convergence rate of the nonlocal PDE is independent of the potential barrier, in contrast to the exponential dependence in the case of the Langevin diffusion. We illustrate the efficiency of the birth-death accelerated Langevin method through several analytical examples and numerical experiments.
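A particle version of the birth-death mechanism can be sketched as below. This is a minimal one-dimensional illustration under stated assumptions, not the authors' reference implementation: the two-mode target, the Gaussian kernel density estimate with bandwidth `h`, the finite-difference gradient, and all function names are hypothetical choices made for this example. Each step applies an Euler-Maruyama Langevin move and then, for each particle, a kill-or-duplicate jump whose rate is the centered log-ratio of the estimated density to the target.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of an illustrative two-mode Gaussian mixture."""
    a = -0.5 * (x + 2.0) ** 2
    b = -0.5 * (x - 2.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def grad_log_target(x, eps=1e-5):
    """Central finite-difference gradient (assumption: used only for brevity)."""
    return (log_target(x + eps) - log_target(x - eps)) / (2.0 * eps)


def log_kde(x, particles, h):
    """Log of a Gaussian kernel density estimate at x, via log-sum-exp."""
    exps = [-0.5 * ((x - y) / h) ** 2 for y in particles]
    m = max(exps)
    s = sum(math.exp(e - m) for e in exps)
    return m + math.log(s) - math.log(len(particles) * h * math.sqrt(2.0 * math.pi))


def bdls_step(particles, dt, h, rng):
    """One Langevin move followed by one birth-death sweep; returns a new list."""
    n = len(particles)
    # Langevin (Euler-Maruyama) move for dX = grad log pi(X) dt + sqrt(2) dW.
    moved = [
        x + dt * grad_log_target(x) + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
        for x in particles
    ]
    if n < 2:
        return moved  # birth-death needs at least one other particle
    # Rate = log(estimated density / target), centered so the rates average to
    # zero; centering also cancels the target's unknown normalizing constant.
    raw = [log_kde(x, moved, h) - log_target(x) for x in moved]
    mean = sum(raw) / n
    beta = [r - mean for r in raw]
    # Simplification: rates are computed once per step and not refreshed
    # after each individual jump.
    for i in range(n):
        if rng.random() < 1.0 - math.exp(-abs(beta[i]) * dt):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # uniform choice among the other particles
            if beta[i] > 0:
                moved[i] = moved[j]  # over-represented: kill i, duplicate j
            else:
                moved[j] = moved[i]  # under-represented: duplicate i, kill j
    return moved
```

To try it, create `rng = random.Random(0)`, start from something like `[-2.0] * 100` (all particles in one mode), and call `bdls_step(particles, 0.05, 0.3, rng)` repeatedly. Pairing every kill with a duplication keeps the particle count fixed, mirroring mass conservation in the mean-field equation.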
