Cost-Free Personalization via Information-Geometric Projection in Bayesian Federated Learning

Nour Jamoussi, Giuseppe Serra, Photios A. Stavrou, Marios Kountouris

TL;DR

The paper tackles data heterogeneity and privacy in Federated Learning by introducing cost-free personalization through an information-geometric projection. The global posterior is projected onto a neighborhood of each client's local posterior, and this projection is equivalent to a $D$-barycenter of the two posteriors with weights $w_g = 1/(\lambda+1)$ and $w_k = \lambda/(\lambda+1)$, where $\lambda$ is linked to a radius parameter $\rho$ of the neighborhood. The equivalence holds when the divergence is convex in its first argument, and it yields closed-form personalization for Gaussian posteriors within a Variational Bayes framework using IVON. Empirically, the method balances global generalization and local specialization with minimal overhead, delivering well-calibrated uncertainty and strong cross-client performance on FashionMNIST, SVHN, and CIFAR-10, while remaining robust in non-i.i.d. settings. The approach extends beyond parametric BFL to domain adaptation and model merging, which makes it relevant for privacy-preserving, uncertainty-aware personalization in heterogeneous distributed learning.

Abstract

Bayesian Federated Learning (BFL) combines uncertainty modeling with decentralized training, enabling the development of personalized and reliable models under data heterogeneity and privacy constraints. Existing approaches typically rely on Markov Chain Monte Carlo (MCMC) sampling or variational inference, often incorporating personalization mechanisms to better adapt to local data distributions. In this work, we propose an information-geometric projection framework for personalization in parametric BFL. By projecting the global model onto a neighborhood of the user's local model, our method enables a tunable trade-off between global generalization and local specialization. Under mild assumptions, we show that this projection step is equivalent to computing a barycenter on the statistical manifold, allowing us to derive closed-form solutions and achieve cost-free personalization. We apply the proposed approach to a variational learning setup using the Improved Variational Online Newton (IVON) optimizer and extend its application to general aggregation schemes in BFL. Empirical evaluations under heterogeneous data distributions confirm that our method effectively balances global and local performance with minimal computational overhead.
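To make the closed-form step concrete, the sketch below computes a weighted barycenter of a global and a local posterior, using the weights $w_g = 1/(\lambda+1)$ and $w_k = \lambda/(\lambda+1)$ from Theorem 1. It rests on two assumptions that the summary above does not confirm: both posteriors are diagonal Gaussians, and the divergence is the reverse KL divergence $\mathrm{KL}(q \,\|\, p)$, for which the barycenter is the normalized geometric mean of the two densities. The function name `personalize_diag_gaussian` is hypothetical and is not taken from the paper's code.

```python
import numpy as np


def personalize_diag_gaussian(mu_g, var_g, mu_k, var_k, lam):
    """Weighted reverse-KL barycenter of two diagonal Gaussians.

    Illustrative assumption: D(q, p) = KL(q || p). Under this choice the
    minimizer of w_g * D(q, p_g) + w_k * D(q, p_k) is proportional to
    p_g**w_g * p_k**w_k, which is again a Gaussian.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")

    # Barycenter weights as stated in Theorem 1.
    w_g = 1.0 / (lam + 1.0)
    w_k = lam / (lam + 1.0)

    mu_g = np.asarray(mu_g, dtype=float)
    mu_k = np.asarray(mu_k, dtype=float)
    prec_g = 1.0 / np.asarray(var_g, dtype=float)
    prec_k = 1.0 / np.asarray(var_k, dtype=float)

    # Precisions combine linearly; the mean is precision-weighted.
    prec = w_g * prec_g + w_k * prec_k
    mu = (w_g * prec_g * mu_g + w_k * prec_k * mu_k) / prec
    return mu, 1.0 / prec


if __name__ == "__main__":
    # lam = 0 recovers the global posterior; large lam approaches the local one.
    for lam in (0.0, 1.0, 1e6):
        mu, var = personalize_diag_gaussian([0.0], [1.0], [2.0], [0.5], lam)
        print(lam, mu, var)
```

Under these assumptions the personalization step needs only element-wise arithmetic on the means and variances already held by the client, with no extra training pass, which is one way to read the "cost-free" claim.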

Paper Structure

This paper contains 41 sections, 1 theorem, 18 equations, 11 figures, 2 tables.

Key Result

Theorem 1

Under the paper's assumption on the divergences, the solution of the projection problem is equivalent to that of the weighted barycenter problem, where the weights $w_g$ and $w_k$ are given by $w_g = 1/(\lambda+1)$ and $w_k = \lambda/(\lambda+1)$ for some $\lambda \in [0, \infty)$.
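
The weights in Theorem 1 can be motivated by a short Lagrangian argument. The following is a sketch under assumed notation, not the paper's proof: $p_g$ and $p_k$ denote the global and local posteriors, $r_k$ is the radius of the local neighborhood, and the projection is assumed to take the form

$$
q_k^\star = \arg\min_{q} \; D(q, p_g) \quad \text{subject to} \quad D(q, p_k) \le r_k .
$$

With a multiplier $\lambda \ge 0$ on the constraint, the Lagrangian is

$$
\mathcal{L}(q, \lambda) = D(q, p_g) + \lambda \big( D(q, p_k) - r_k \big).
$$

For fixed $\lambda$, dropping the constant $-\lambda r_k$ and dividing by $1 + \lambda > 0$ leaves the minimizer over $q$ unchanged, so

$$
q_k^\star = \arg\min_{q} \; \frac{1}{\lambda + 1} D(q, p_g) + \frac{\lambda}{\lambda + 1} D(q, p_k),
$$

which is a weighted barycenter with $w_g = 1/(\lambda+1)$ and $w_k = \lambda/(\lambda+1)$. Convexity of $D$ in its first argument is what allows the constrained problem to be characterized through its Lagrangian. The limits agree with the caption of Figure 2: $\lambda = 0$ (inactive constraint) returns the global model, and $\lambda \to \infty$ returns the local model.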

Figures (11)

  • Figure 1: Personalization through information-geometric projection. The figure presents two projection scenarios illustrated with two local spheres $\mathcal{S}_k^1$ and $\mathcal{S}_k^2$ of increasing radius $r_k^1$ and $r_k^2$, highlighting the impact of the radius on the closeness of the projected distribution to the global or local distribution.
  • Figure 2: Effect of $\lambda$ on performance across local and global data distributions. Results are reported for the CIFAR-10 dataset. Notably, $\lambda = 0$ corresponds to the global model, whereas $\lambda \to \infty$ corresponds to the local model.
  • Figure 3: Trade-offs between local and global performance for personalized models. Each subplot presents results for a different evaluation metric: Accuracy (left), ECE (center), and NLL (right). Points represent method–dataset pairs. For Accuracy, the top-right region indicates a better performance trade-off, whereas for ECE and NLL, the bottom-left region is preferable. Our method (with $\lambda = 1$) consistently achieves a favorable balance across all metrics and datasets.
  • Figure 4: Performance of the global model on clients' local data. Top row: average accuracy. Bottom row: worst-case performance. Each column corresponds to a different dataset.
  • Figure 5: Wilcoxon signed-rank test $p$-values comparing aggregation methods across all datasets for three evaluation metrics: (a) accuracy, (b) ECE, and (c) NLL. Lower $p$-values indicate statistically significant differences between methods. Only the lower triangle of each matrix is shown to avoid redundancy.
  • ...and 6 more figures

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Theorem 1
  • proof