DIGing-SGLD: Decentralized and Scalable Langevin Sampling over Time-Varying Networks
Waheed U. Bajwa, Mert Gurbuzbalaban, Mustafa Ali Kutbay, Lingjiong Zhu, Muhammad Zulqarnain
TL;DR
DIGing-SGLD addresses decentralized Bayesian posterior sampling over time-varying networks by combining stochastic gradient Langevin dynamics (SGLD) with the gradient-tracking mechanism of the DIGing algorithm. Under standard strong convexity and smoothness assumptions, it comes with finite-time, non-asymptotic guarantees: the iterates converge geometrically in the 2-Wasserstein distance $\mathcal{W}_2$ to an $O(\sqrt{\eta})$ neighborhood of the Gibbs (target) distribution, with an explicit iteration complexity of $K = O\big(\log(1/\epsilon)/\epsilon^{2}\big)$ when the stepsize is set to $\eta = O(\epsilon^{2})$. The method needs no central coordinator and accommodates time-varying connectivity, and its dependence on the target accuracy $\epsilon$ matches the best-known rates for centralized and static-network SGLD with constant stepsize, despite network drift and gradient noise. Numerical experiments on Bayesian linear and logistic regression validate the theory, showing robust performance and reduced sampling bias under dynamic network conditions. To the authors' knowledge, the paper provides the first non-asymptotic guarantees with explicit constants for decentralized SGLD on time-varying graphs, and it demonstrates practical viability for scalable, coordinator-free Bayesian inference in evolving networks.
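The link between the stepsize choice and the stated iteration complexity can be sketched in a few lines. This is an illustration only: it assumes the bound has the generic form below, with a contraction factor $1 - c\eta$ and constants $C_1, C_2, c > 0$ that are placeholders rather than the paper's explicit constants, and it writes $\mu_K$ for the law of the iterate after $K$ steps and $\pi$ for the target.

$$
\mathcal{W}_2(\mu_K, \pi) \;\le\; C_1 (1 - c\eta)^{K} + C_2 \sqrt{\eta}.
$$

Choosing $\eta = \epsilon^{2}/C_2^{2}$ makes the second term equal to $\epsilon$. Since $-\log(1 - c\eta) \ge c\eta$, the first term is at most $\epsilon$ as soon as

$$
K \;\ge\; \frac{\log(C_1/\epsilon)}{c\,\eta} \;=\; \frac{C_2^{2}\,\log(C_1/\epsilon)}{c\,\epsilon^{2}} \;=\; O\!\left(\frac{\log(1/\epsilon)}{\epsilon^{2}}\right),
$$

so that $\mathcal{W}_2(\mu_K, \pi) \le 2\epsilon$ under these assumptions.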
Abstract
Sampling from a target distribution induced by training data is central to Bayesian learning, with Stochastic Gradient Langevin Dynamics (SGLD) serving as a key tool for scalable posterior sampling and decentralized variants enabling learning when data are distributed across a network of agents. This paper introduces DIGing-SGLD, a decentralized SGLD algorithm designed for scalable Bayesian learning in multi-agent systems operating over time-varying networks. Existing decentralized SGLD methods are restricted to static network topologies, and many exhibit steady-state sampling bias caused by network effects, even when full batches are used. DIGing-SGLD overcomes these limitations by integrating Langevin-based sampling with the gradient-tracking mechanism of the DIGing algorithm, originally developed for decentralized optimization over time-varying networks, thereby enabling efficient and bias-free sampling without a central coordinator. To our knowledge, we provide the first finite-time non-asymptotic Wasserstein convergence guarantees for decentralized SGLD-based sampling over time-varying networks, with explicit constants. Under standard strong convexity and smoothness assumptions, DIGing-SGLD achieves geometric convergence to an $O(\sqrt{\eta})$ neighborhood of the target distribution, where $\eta$ is the stepsize, with dependence on the target accuracy matching the best-known rates for centralized and static-network SGLD algorithms using constant stepsize. Numerical experiments on Bayesian linear and logistic regression validate the theoretical results and demonstrate the strong empirical performance of DIGing-SGLD under dynamically evolving network conditions.
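To make the idea of combining Langevin sampling with gradient tracking concrete, the following is a minimal sketch of what such a recursion could look like. It is not taken from the paper: it assumes the standard DIGing updates (mix with a doubly stochastic matrix, step along a tracked gradient, then correct the tracker with the change in local gradients) with Gaussian noise of scale $\sqrt{2\eta}$ injected into the iterate update. The names `mixing_matrix` and `diging_sgld`, the quadratic local potentials, the alternating pairwise network, and the use of exact rather than stochastic gradients are all illustrative assumptions.

```python
import numpy as np

def mixing_matrix(n, pairs):
    """Doubly stochastic matrix that averages each listed (disjoint) pair of agents."""
    W = np.eye(n)
    for i, j in pairs:
        W[i, i] = W[j, j] = 0.5
        W[i, j] = W[j, i] = 0.5
    return W

def diging_sgld(grads, x0, mixing_seq, eta, num_iters, rng, noise=True):
    """Hypothetical DIGing-style Langevin recursion (a sketch, not the paper's algorithm).

    grads(X) returns the (n, d) array of local gradients, row i evaluated at X[i].
    mixing_seq is a list of (n, n) doubly stochastic matrices, cycled over time.
    """
    X = x0.copy()
    G = grads(X)
    Y = G.copy()  # gradient tracker, initialised at the local gradients
    for k in range(num_iters):
        W = mixing_seq[k % len(mixing_seq)]  # time-varying network
        xi = rng.standard_normal(X.shape) if noise else 0.0
        # Consensus step, descent along the tracked gradient, Langevin noise.
        X_new = W @ X - eta * Y + np.sqrt(2.0 * eta) * xi
        G_new = grads(X_new)
        # Tracking step: keeps mean(Y) equal to the mean of the local gradients.
        Y = W @ Y + G_new - G
        X, G = X_new, G_new
    return X, Y

# Illustrative setup (assumed): 4 agents, local potentials
# f_i(x) = 0.5 * a_i * ||x - b_i||^2, so the target exp(-sum_i f_i) is Gaussian.
n, d = 4, 2
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.arange(n * d, dtype=float).reshape(n, d)
grads = lambda X: a[:, None] * (X - b)

# Two sparse graphs that are connected only jointly, used in alternation.
Ws = [mixing_matrix(n, [(0, 1), (2, 3)]), mixing_matrix(n, [(1, 2), (3, 0)])]

X, Y = diging_sgld(grads, np.zeros((n, d)), Ws, eta=0.01, num_iters=5000,
                   rng=np.random.default_rng(0))
print("final per-agent samples:\n", X)
```

Under this scaling, averaging the iterate update over agents gives a step of size $\eta/n$ along $\sum_i \nabla f_i$ with noise variance $2\eta/n$, which is why the sketch treats $\exp(-\sum_i f_i)$ as the target. Setting `noise=False` reduces the recursion to plain DIGing, which is a convenient sanity check that the agents reach consensus at the minimizer of $\sum_i f_i$.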
