Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Mohsin Hasan; Guojun Zhang; Kaiyang Guo; Xi Chen; Pascal Poupart

TL;DR

Federated Learning (FL) faces calibration challenges when client data are heterogeneous. The authors show that the Bayesian Committee Machine (BCM) can produce overconfident aggregated predictions and introduce β-Predictive Bayes, which interpolates between the BCM product and a predictive mixture using a tunable parameter $β$, followed by distillation to a single deployable model. The method learns $β$ by minimizing the negative log-likelihood (NLL) on a server dataset and demonstrates improved calibration (lower NLL and expected calibration error, ECE) on both classification and regression tasks in a single communication round. The work provides a theoretical calibration analysis and extensive empirical results showing improved uncertainty estimates in FL with limited communication and heterogeneous data.
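As a rough illustration of the two calibration metrics named above, the following sketch computes NLL and ECE for a classifier's predicted probabilities. The function names, the equal-width binning, and the bin count of 10 are assumptions made for this example and are not taken from the paper or its code.

```python
import numpy as np

def nll(probs, labels):
    """Mean negative log-likelihood of the true labels.

    probs: (n_points, n_classes) predicted probabilities.
    labels: (n_points,) integer class labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    eps = 1e-12  # guards against log(0)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + eps)))

def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width confidence bins (assumed binning)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    conf = probs.max(axis=1)                      # confidence of the predicted class
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            # weight each bin's |accuracy - confidence| gap by its share of points
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            total += in_bin.mean() * gap
    return float(total)
```

Lower values of both metrics indicate better calibration; NLL also penalizes inaccurate predictions, whereas ECE only measures the gap between confidence and accuracy.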

Abstract

Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting the need for well-calibrated models that represent the uncertainty of predictions. The closest FL techniques to achieving such goals are the Bayesian FL methods, which collect parameter samples from local posteriors and aggregate them to approximate the global posterior. To improve scalability for larger models, one common Bayesian approach is to approximate the global predictive posterior by multiplying local predictive posteriors. In this work, we demonstrate that this method gives systematically overconfident predictions, and we remedy this by proposing $β$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and product of the predictive posteriors, using a tunable parameter $β$. This parameter is tuned to improve the global ensemble's calibration, before it is distilled to a single model. Our method is evaluated on a variety of regression and classification datasets to demonstrate its superiority in calibration to other baselines, even as data heterogeneity increases. Code is available at https://github.com/hasanmohsin/betaPredBayesFL
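The interpolation described in the abstract can be sketched for classification as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a log-linear interpolation between the normalized product and the mixture of client predictive distributions, a uniform prior (so the product needs no prior correction), the convention that β = 1 recovers the product and β = 0 the mixture, and a simple grid search in place of whatever optimizer the authors use to tune β. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against log(0)

def _log_normalize(logp):
    """Normalize log-scores over the last axis so they exponentiate to a distribution."""
    m = logp.max(axis=-1, keepdims=True)
    return logp - (m + np.log(np.exp(logp - m).sum(axis=-1, keepdims=True)))

def product_log_probs(client_probs):
    """Normalized product of client predictives (BCM-style, uniform prior assumed)."""
    return _log_normalize(np.log(client_probs + EPS).sum(axis=0))

def mixture_log_probs(client_probs):
    """Equal-weight mixture (average) of client predictives."""
    return np.log(client_probs.mean(axis=0) + EPS)

def beta_interpolate(client_probs, beta):
    """Assumed form: log p_beta = beta * log p_product + (1 - beta) * log p_mixture + const."""
    client_probs = np.asarray(client_probs, dtype=float)
    logp = beta * product_log_probs(client_probs) \
        + (1.0 - beta) * mixture_log_probs(client_probs)
    return np.exp(_log_normalize(logp))

def tune_beta(client_probs, labels, n_grid=21):
    """Pick beta in [0, 1] minimizing NLL on held-out (server) data via grid search."""
    labels = np.asarray(labels, dtype=int)
    grid = np.linspace(0.0, 1.0, n_grid)
    losses = []
    for beta in grid:
        probs = beta_interpolate(client_probs, beta)
        true_class_probs = probs[np.arange(len(labels)), labels]
        losses.append(-np.mean(np.log(true_class_probs + EPS)))
    return float(grid[int(np.argmin(losses))])
```

Here `client_probs` has shape `(n_clients, n_points, n_classes)`. With two clients predicting `[0.8, 0.2]` and `[0.6, 0.4]` for the same input, the mixture gives `[0.7, 0.3]` while the normalized product gives roughly `[0.857, 0.143]`, which shows how the product sharpens predictions when clients agree; intermediate values of β fall between the two.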

Paper Structure (32 sections, 20 theorems, 28 equations, 2 figures, 6 tables, 1 algorithm)

Introduction
Background
Bayesian Learning
Knowledge Distillation
Related Work
Bayesian Techniques in FL
One-Shot Federated Learning
Analysis of BCM Calibration
Analyzing the Predictive Mixture Model
Calibration Analysis for Classification
Calibrating the Aggregated Model
Experiments
Classification Dataset Setup
Classification Results
Regression Dataset Setup
...and 17 more sections

Key Result

Lemma 1

Assume $x^* \in \mathbb{R}$. Under some mild conditions on the kernel function, and under the assumption of Gaussian or Laplacian observation noise, as the number of data points increases, $\sigma^2(x^*) \to \sigma^2_o$ (and in addition, the predictive mean converges to the true function value: $\mu(x^*) \to

Figures (2)

Figure 1: NLL on the classification datasets with increasing heterogeneity (tested with $h \in \{0.0, 0.3, 0.6, 0.9\}$). Averages and standard errors over 10 seeds are reported. Omitted values (e.g., for FedPA on EMNIST) denote results where NLL diverged.
Figure 2: ECE on the classification datasets with increasing heterogeneity (tested with $h \in \{0.0, 0.3, 0.6, 0.9\}$). Averages and standard errors over 10 seeds are reported.

Theorems & Definitions (29)

Lemma 1: choiGPconsistency
Lemma 2
Theorem 1: BCM, homogeneous
Theorem 2: BCM, heterogeneous
Theorem 3: mixture model, homogeneous
Theorem 4: mixture model, heterogeneous
Theorem 5: BCM, homogeneous, classification
Theorem 6: BCM, heterogeneous, classification
Theorem 7: mixture model, heterogeneous, classification
Theorem 8: mixture model, homogeneous, classification
...and 19 more
