Asynchronous Local Computations in Distributed Bayesian Learning
Kinjal Bhar, He Bai, Jemin George, Carl Busart
TL;DR
This work develops an asynchronous gossip-based distributed Bayesian learning framework that uses multiple local Unadjusted Langevin Algorithm (ULA) updates between inter-agent communications to reduce communication overhead. By modeling gossip with a Poisson process and reusing mini-batch gradients within cycles, the method achieves consensus and converges to the Bayesian posterior p^* under Lipschitz and log-Sobolev assumptions, with rates that scale polynomially with the cycle parameter δ_α. Theoretical results establish consensus and KL-divergence convergence, while experiments on a toy Gaussian problem and real datasets (Gamma Telescope and mHealth) demonstrate faster initial convergence and robust classification performance, particularly in low-data regimes. The approach is applicable to both decentralized and federated-like settings, offering practical gains in speed and uncertainty quantification. Key contributions include a formal analysis of local computations per cycle, explicit step-size and local-iteration conditions, and validation of asynchronous gossip-ULA in realistic classification tasks.
Abstract
As machine learning (ML) expands into sensor networking, cooperative robotics, and many other multi-agent systems, distributed deployment of inference algorithms has received considerable attention. These algorithms collaboratively learn unknown parameters from dispersed data collected by multiple agents. Two aspects compete in such algorithms: intra-agent computation and inter-agent communication. Traditionally, algorithms are designed to perform both synchronously. However, some settings require frugal use of communication channels because the channels are unreliable, time-consuming, or resource-expensive. In this paper, we propose gossip-based asynchronous communication to leverage fast computations and reduce communication overhead simultaneously. We analyze the effects of multiple (local) intra-agent computations performed by the active agents between successive inter-agent communications. For the local computations, we use Bayesian sampling via unadjusted Langevin algorithm (ULA) MCMC. Communication is assumed to take place over a connected graph (e.g., as in decentralized learning); however, the results can be extended to coordinated communication with a central server (e.g., federated learning). We theoretically quantify the resulting convergence rates. To demonstrate the efficacy of the proposed algorithm, we present simulations on a toy problem as well as on real-world data sets, training ML models to perform classification tasks. We observe faster initial convergence and improved accuracy, especially in the low-data regime. We achieve, on average, 78% and over 90% classification accuracy on the Gamma Telescope and mHealth data sets, respectively, from the UCI ML repository.
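The scheme described in the abstract can be illustrated with a minimal sketch; it is not the authors' exact update rule. It assumes a scalar Gaussian-mean toy model with unit noise variance, a ring graph, a uniformly random edge per cycle (the jump chain of equal-rate Poisson clocks), plain pairwise averaging as the gossip step, and full local gradients scaled by the number of agents. The names `local_grad`, `gossip_ula`, `local_steps`, and `prior_var` are hypothetical.

```python
import numpy as np


def local_grad(theta, data, n_agents, prior_var=10.0):
    """Gradient of one agent's share of the negative log-posterior.

    Assumed model: x ~ N(theta, 1) with prior theta ~ N(0, prior_var),
    where the prior term is split evenly across the agents.
    """
    return np.sum(theta - data) + theta / (prior_var * n_agents)


def gossip_ula(datasets, edges, n_cycles=2000, local_steps=5, step=1e-3, seed=0):
    """Asynchronous gossip with several local ULA steps per cycle (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(datasets)
    theta = rng.normal(size=n)          # one scalar sample per agent
    history = np.empty((n_cycles, n))
    for c in range(n_cycles):
        # One gossip event: a uniformly random edge becomes active.
        i, j = edges[rng.integers(len(edges))]
        # Communication: the two active agents average their samples.
        theta[i] = theta[j] = 0.5 * (theta[i] + theta[j])
        # Computation: each active agent runs local ULA steps on its own data.
        for a in (i, j):
            for _ in range(local_steps):
                # Scale by n so the local gradient stands in for the global one.
                g = n * local_grad(theta[a], datasets[a], n)
                noise = np.sqrt(2.0 * step) * rng.normal()
                theta[a] = theta[a] - step * g + noise
        history[c] = theta              # inactive agents keep their samples
    return history


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    datasets = [data_rng.normal(2.0, 1.0, size=20) for _ in range(4)]
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    samples = gossip_ula(datasets, ring)
    print("mean of late samples:", samples[1000:].mean())
```

Raising `local_steps` lets each agent do more computation per communication event, which is the trade-off the abstract analyzes. In this sketch, more local steps also pull each agent further toward its own local data between averaging events.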
