Diffusive Gibbs Sampling
Wenlin Chen, Mingtian Zhang, Brooks Paige, José Miguel Hernández-Lobato, David Barber
TL;DR
Diffusive Gibbs Sampling (DiGS) targets multi-modal unnormalized distributions by pairing Gaussian convolution with a Metropolis-within-Gibbs scheme on the joint space $p(x,\tilde{x})=p(\tilde{x}|x)p(x)$. It alternates between sampling the noisy variable $\tilde{x}$ given $x$ and the clean variable $x$ given $\tilde{x}$ (the denoising step), and it initializes the denoising step with a Metropolis-Hastings (MH) move. This lets DiGS explore modes robustly without requiring the intractable score of the convolved distribution. The paper reports strong empirical gains over standard MCMC baselines (MALA, HMC, parallel tempering) on synthetic mixture-of-Gaussians (MoG) problems, Bayesian neural networks, and molecular configuration sampling, including substantially fewer energy evaluations on molecular-dynamics-style tasks. A multi-level, variance-preserving noise schedule further improves efficiency. The method is positioned as a practical, scalable member of the auxiliary-variable MCMC family, with clear avenues for future theoretical and methodological work.
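The alternating scheme can be illustrated with a minimal single-noise-level sketch on a toy 1D target. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the kernel form $p(\tilde{x}|x)=\mathcal{N}(\tilde{x}\,|\,\alpha x,\sigma^2)$ with $\alpha^2+\sigma^2=1$, the values of `alpha`, `n_inner` and `step`, the two-mode target `log_p`, and the function names. The denoising step here uses plain random-walk Metropolis moves rather than a gradient-based sampler. Under the assumed kernel, the proposal $\mathcal{N}(x\,|\,\tilde{x}/\alpha,(\sigma/\alpha)^2)$ is proportional in $x$ to the Gaussian factor of $p(x|\tilde{x})$, so the MH acceptance ratio for the initialization reduces to $p(x')/p(x)$.

```python
import numpy as np

def log_p(x):
    """Unnormalized log-density of a hypothetical 1D target with modes at -4 and +4."""
    return np.logaddexp(-0.5 * ((x + 4.0) / 0.5) ** 2,
                        -0.5 * ((x - 4.0) / 0.5) ** 2)

def log_cond(x, x_noisy, alpha, sigma):
    """Unnormalized log p(x | x_noisy) = log p(x) + log N(x_noisy | alpha * x, sigma^2)."""
    return log_p(x) - 0.5 * ((x_noisy - alpha * x) / sigma) ** 2

def digs_sweep(x, rng, alpha, sigma, n_inner=5, step=0.3):
    """One Gibbs sweep over the joint p(x, x_noisy); returns the updated clean sample."""
    # Noising step: draw x_noisy ~ N(alpha * x, sigma^2).
    x_noisy = alpha * x + sigma * rng.standard_normal()

    # MH initialization of the denoising step: independence proposal
    # N(x_noisy / alpha, (sigma / alpha)^2). The Gaussian factors cancel,
    # so the log acceptance ratio is log p(x_prop) - log p(x).
    x_prop = x_noisy / alpha + (sigma / alpha) * rng.standard_normal()
    if np.log(rng.uniform()) < log_p(x_prop) - log_p(x):
        x = x_prop

    # Denoising step: a few random-walk Metropolis moves targeting p(x | x_noisy)
    # (a simplification chosen for this sketch).
    for _ in range(n_inner):
        x_new = x + step * rng.standard_normal()
        log_ratio = (log_cond(x_new, x_noisy, alpha, sigma)
                     - log_cond(x, x_noisy, alpha, sigma))
        if np.log(rng.uniform()) < log_ratio:
            x = x_new
    return x

def run_digs(n_sweeps=5000, x0=-4.0, alpha=0.1, seed=0):
    """Run the sketch sampler starting inside the left mode."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(1.0 - alpha ** 2)  # variance-preserving choice (assumed)
    x = x0
    samples = np.empty(n_sweeps)
    for i in range(n_sweeps):
        x = digs_sweep(x, rng, alpha, sigma)
        samples[i] = x
    return samples

if __name__ == "__main__":
    samples = run_digs()
    print("fraction of samples in the right-hand mode:", np.mean(samples > 0))
```

With a small `alpha`, the proposal in the initialization step is wide enough to land in the other mode, which a local random-walk chain with the same `step` would rarely reach. A single noise level is used here for brevity; the multi-level schedule mentioned above is not shown.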
Abstract
The inadequate mixing of conventional Markov chain Monte Carlo (MCMC) methods for multi-modal distributions presents a significant challenge in practical applications such as Bayesian inference and molecular dynamics. To address this, we propose Diffusive Gibbs Sampling (DiGS), an innovative family of sampling methods designed for effective sampling from distributions characterized by distant and disconnected modes. DiGS integrates recent developments in diffusion models: it uses Gaussian convolution to create an auxiliary noisy distribution that bridges isolated modes in the original space, and applies Gibbs sampling to alternately draw samples from both spaces. A novel Metropolis-within-Gibbs scheme is proposed to enhance mixing in the denoising sampling step. DiGS mixes better on multi-modal distributions than state-of-the-art methods such as parallel tempering, attaining substantially improved performance across various tasks, including mixtures of Gaussians, Bayesian neural networks, and molecular dynamics.
