Information-Geometric Barycenters for Bayesian Federated Learning
Nour Jamoussi, Giuseppe Serra, Photios A. Stavrou, Marios Kountouris
TL;DR
This work reframes federated learning aggregation as computing a barycenter of local posteriors on a statistical manifold, an information-geometric perspective that unifies existing Bayesian aggregation methods. It introduces BA-BFL, which preserves the convergence properties of FedAvg and admits two analytically tractable Gaussian barycenters, the reverse KL (RKL) barycenter and the squared 2-Wasserstein (W2) barycenter, each with explicit closed-form updates for diagonal covariances. The method is extended to Hybrid Bayesian Deep Learning, where Bayesian layers provide tunable uncertainty quantification. Experiments on FashionMNIST, SVHN, and CIFAR-10 show that it is robust to heterogeneous data, with competitive accuracy and improved calibration, and examine the trade-off between Bayesian complexity and computation. The work lays groundwork for broader divergence choices and nonparametric extensions, and suggests personalization via clustering on the posterior manifold to further improve performance in distributed, heterogeneous environments.
Abstract
Federated learning (FL) is a widely used and impactful distributed optimization framework that achieves consensus by averaging locally trained models. While effective, this approach may not align well with Bayesian inference, where the model space has the structure of a distribution space. Taking an information-geometric perspective, we reinterpret FL aggregation as the problem of finding the barycenter of local posteriors under a prespecified divergence, that is, the distribution minimizing the average discrepancy to the clients' posteriors. This perspective provides a unifying framework that generalizes many existing methods and offers clear insights into their theoretical underpinnings. We then propose BA-BFL, an algorithm that retains the convergence properties of Federated Averaging in non-convex settings. In non-IID (non-independent and identically distributed) scenarios, we conduct extensive comparisons with statistical aggregation techniques, showing that BA-BFL achieves performance comparable to state-of-the-art methods while offering a geometric interpretation of the aggregation phase. Additionally, we extend our analysis to Hybrid Bayesian Deep Learning, exploring the impact of Bayesian layers on uncertainty quantification and model calibration.
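To make the barycenter view concrete, the following is a minimal sketch of the aggregation step for diagonal Gaussian posteriors, the case mentioned in the TL;DR. It is an illustration under stated assumptions, not the authors' implementation. The function names, the list-based data layout, and the toy numbers are hypothetical. The W2 rule (average the means and average the standard deviations) is the standard 2-Wasserstein barycenter of diagonal Gaussians. The KL rule shown here minimizes the weighted sum of KL(p_k || q) over Gaussians q, which amounts to moment matching; whether this argument order corresponds to the paper's RKL convention is an assumption.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def _weighted_sum(weights: Sequence[float], rows: Sequence[Sequence[float]]) -> Vector:
    """Coordinate-wise weighted sum of equally sized vectors."""
    dim = len(rows[0])
    return [sum(w * row[i] for w, row in zip(weights, rows)) for i in range(dim)]


def w2_barycenter(
    means: Sequence[Sequence[float]],
    stds: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[Vector, Vector]:
    """2-Wasserstein barycenter of diagonal Gaussians.

    For diagonal covariances, both the mean and the standard deviation
    of the barycenter are weighted averages of the client values.
    Weights are assumed non-negative and summing to one.
    """
    return _weighted_sum(weights, means), _weighted_sum(weights, stds)


def kl_moment_matching_barycenter(
    means: Sequence[Sequence[float]],
    stds: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[Vector, Vector]:
    """Gaussian q minimizing sum_k w_k * KL(p_k || q) (assumed convention).

    The minimizer matches the first two moments of the weighted mixture:
    mean = sum_k w_k mu_k, var = sum_k w_k (sigma_k^2 + mu_k^2) - mean^2.
    """
    mean = _weighted_sum(weights, means)
    second_moments = [
        [s * s + m * m for m, s in zip(mu, sd)] for mu, sd in zip(means, stds)
    ]
    second = _weighted_sum(weights, second_moments)
    std = [math.sqrt(m2 - m * m) for m2, m in zip(second, mean)]
    return mean, std


# Toy example: two clients, two parameters, equal weights (hypothetical values).
client_means = [[0.0, 2.0], [2.0, 4.0]]
client_stds = [[1.0, 1.0], [3.0, 1.0]]
client_weights = [0.5, 0.5]

w2_mean, w2_std = w2_barycenter(client_means, client_stds, client_weights)
kl_mean, kl_std = kl_moment_matching_barycenter(client_means, client_stds, client_weights)
```

In this toy setting both rules return the same mean. The moment-matching variance additionally absorbs the spread between client means, so it is never smaller than the averaged client variance, whereas the W2 rule depends only on the client standard deviations.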
