Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Haoyang Zheng, Hengrong Du, Qi Feng, Wei Deng, Guang Lin
TL;DR
This work introduces Reflected Replica Exchange Langevin Diffusion (r2LD) for constrained non-convex exploration within a bounded domain, combining boundary reflection with temperature-swapped Langevin dynamics. The authors prove convergence guarantees in $\chi^2$-divergence and $W_2$ distance via Poincaré and Log-Sobolev inequalities, with mixing rates that improve quadratically as the domain diameter shrinks, and they quantify the discretization error of the practical algorithm in $W_1$. The practical instantiation, r2SGLD, uses mini-batch gradients, reflection, and a corrected swapping term to accelerate mixing across multimodal landscapes while avoiding boundary leakage. Empirically, r2SGLD improves Lorenz system identification under physical constraints, captures all modes in constrained multimodal distributions, and permits larger learning rates with improved uncertainty estimates in CIFAR-100 classification. Overall, the paper demonstrates that constrained exploration, enabled by reflection and replica exchange, improves sampling efficiency and extends to dynamical systems and deep learning tasks.
Abstract
Replica exchange stochastic gradient Langevin dynamics (reSGLD) is an effective sampler for non-convex learning on large-scale datasets. However, the simulation may stagnate when the high-temperature chain delves too deeply into the distribution tails. To tackle this issue, we propose reflected reSGLD (r2SGLD): an algorithm tailored for constrained non-convex exploration that applies reflection steps within a bounded domain. Theoretically, we observe that reducing the diameter of the domain enhances mixing rates, exhibiting a quadratic behavior. Empirically, we test its performance through extensive experiments, including identifying dynamical systems with physical constraints, simulations of constrained multi-modal distributions, and image classification tasks. The theoretical and empirical findings highlight the crucial role of constrained exploration in improving simulation efficiency.
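To make the combination of reflection and replica exchange concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a hypothetical one-dimensional double-well energy, an interval domain [lo, hi] (so reflection reduces to mirroring across the endpoints), and exact gradients and energies. Because the energies are exact, the swap test is the standard replica exchange acceptance rule; the corrected swapping term that r2SGLD uses for noisy mini-batch energy estimates is deliberately omitted. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def energy(x):
    # Hypothetical double-well energy with minima near x = -1 and x = +1.
    return (x**2 - 1.0) ** 2

def grad_energy(x):
    # Exact gradient of the toy energy (no mini-batch noise in this sketch).
    return 4.0 * x * (x**2 - 1.0)

def reflect(x, lo, hi):
    # Fold x back into [lo, hi] by mirroring across whichever endpoint it crossed.
    # The modulo handles overshoots larger than the interval width.
    width = hi - lo
    y = np.mod(x - lo, 2.0 * width)
    return float(lo + np.where(y > width, 2.0 * width - y, y))

def reflected_replica_exchange(n_steps=20000, step=1e-2, tau_low=0.1,
                               tau_high=1.0, lo=-2.0, hi=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x_low, x_high = -1.0, 1.0          # low- and high-temperature chains
    samples = np.empty(n_steps)
    n_swaps = 0
    for k in range(n_steps):
        # Langevin proposal for each chain, followed by reflection into the domain.
        noise_low = np.sqrt(2.0 * step * tau_low) * rng.standard_normal()
        noise_high = np.sqrt(2.0 * step * tau_high) * rng.standard_normal()
        x_low = reflect(x_low - step * grad_energy(x_low) + noise_low, lo, hi)
        x_high = reflect(x_high - step * grad_energy(x_high) + noise_high, lo, hi)

        # Standard swap test with exact energies (assumption: no noise correction).
        log_ratio = (1.0 / tau_low - 1.0 / tau_high) * (energy(x_low) - energy(x_high))
        if rng.random() < np.exp(min(0.0, log_ratio)):
            x_low, x_high = x_high, x_low
            n_swaps += 1

        samples[k] = x_low             # keep only the low-temperature chain
    return samples, n_swaps

if __name__ == "__main__":
    samples, n_swaps = reflected_replica_exchange()
    print("swaps:", n_swaps)
    print("range of low-temperature samples:", samples.min(), samples.max())
```

In this sketch the high-temperature chain crosses the energy barrier easily, the swap passes those discoveries to the low-temperature chain, and the reflection keeps both chains inside the interval rather than letting the hot chain wander into the tails. For a general bounded domain the reflection would be taken about the boundary's tangent hyperplane instead of the interval endpoints.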
