Bayesian Coreset Optimization for Personalized Federated Learning

Prateek Chanda, Shrey Modi, Ganesh Ramakrishnan

TL;DR

This work presents CoreSet-PFedBayes, a personalized federated learning framework that optimizes per-client coresets via Bayesian coreset weights to train on compact, representative data rather than full local datasets. By integrating coreset weighting into the FL objective and adding a likelihood-consistency term, the method maintains alignment with full-data distributions while achieving strong generalization bounds, shown to be minimax-optimal up to logarithmic factors. The approach uses Accelerated Iterative Hard Thresholding (AIHT) to efficiently learn sparse per-client coresets and delivers consistent accuracy gains over random-sample baselines and submodular subset selectors on benchmarks like MNIST, Fashion-MNIST, CIFAR-10, and several medical datasets. Theoretical results quantify the generalization-error trade-offs with coresets, and experiments demonstrate reduced communication rounds and data requirements without sacrificing performance, highlighting the method’s practical impact for privacy-aware, heterogeneous FL.
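To make the sparse-coreset idea concrete, here is a minimal sketch of plain (non-accelerated) iterative hard thresholding for coreset weights. It is not the paper's AIHT routine. It assumes the common sparse-regression view of Bayesian coresets: a matrix `L` whose column `i` holds the log-likelihood of data point `i` at a set of sampled parameters, with weights chosen so the weighted sum approximates the full-data log-likelihood. The names `hard_threshold`, `iht_coreset_weights`, `L`, and `k` are hypothetical.

```python
import numpy as np

def hard_threshold(w, k):
    """Project onto {w >= 0, at most k non-zeros}; assumes k >= 1."""
    w = np.maximum(w, 0.0)  # clip negatives (returns a new array)
    if k < w.size:
        # Indices of the (w.size - k) smallest entries, which get zeroed out.
        drop = np.argpartition(w, w.size - k)[: w.size - k]
        w[drop] = 0.0
    return w

def iht_coreset_weights(L, k, n_iters=200):
    """Sparse non-negative weights w so that L @ w approximates L @ 1.

    L: (S, n) array; column i holds the log-likelihood of data point i
       evaluated at S parameter draws (an assumption of this sketch).
    k: maximum number of points kept in the coreset.
    """
    target = L.sum(axis=1)                   # full-data vector (all weights 1)
    step = 1.0 / np.linalg.norm(L, 2) ** 2   # 1 / Lipschitz constant of gradient
    w = np.zeros(L.shape[1])
    for _ in range(n_iters):
        grad = L.T @ (L @ w - target)        # gradient of 0.5 * ||L w - target||^2
        w = hard_threshold(w - step * grad, k)
    return w
```

Calling `iht_coreset_weights(L, k=5)` returns a weight vector with at most five non-zero entries. The indices of those entries identify the selected coreset points, and their values are the weights attached to them.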

Abstract

In a distributed machine learning setting such as Federated Learning, where multiple clients send their individual weight updates to a single central server, training on each client's entire dataset often becomes cumbersome. To address this issue we propose $\textsc{CoreSet-PFedBayes}$: a personalized, coreset-weighted federated learning setup in which each client's training updates are forwarded to the central server based only on that client's coreset of representative data points rather than on its entire dataset. Through theoretical analysis we show that the average generalization error is minimax optimal up to logarithmic factors (with an upper bound of $\mathcal{O}(n_k^{-\frac{2\beta}{2\beta+\boldsymbol{\Lambda}}} \log^{2\delta^{\prime}}(n_k))$ and a lower bound of $\mathcal{O}(n_k^{-\frac{2\beta}{2\beta+\boldsymbol{\Lambda}}})$), and that the overall generalization error on the data likelihood differs from that of a vanilla Federated Learning setup by a closed-form function $\boldsymbol{\Im}(\boldsymbol{w}, n_k)$ of the coreset weights $\boldsymbol{w}$ and the coreset sample size $n_k$. Our experiments on benchmark datasets, across a variety of recent personalized federated learning architectures, show significant gains over random sampling of the training data followed by federated learning, indicating that intelligently selecting training samples improves performance. Additionally, in experiments on medical datasets, our method shows some gains over submodular-optimization-based approaches to subset selection on client data.
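As a reading aid, the coreset weighting described in the abstract can be written as a weighted log-likelihood. This is a sketch in assumed notation, following the usual Bayesian coreset formulation rather than an equation quoted from the paper: $x_j$ are the points of a client's dataset $\mathcal{D}^i$, $\theta$ are the model parameters, and $n$ is the local dataset size.

```latex
% Assumed notation: x_j are the points of client dataset D^i, theta the model
% parameters, n the local dataset size; w and n_k are as in the abstract.
\log p_{\boldsymbol{w}}(\mathcal{D}^i \mid \theta)
  = \sum_{j=1}^{n} w_j \log p(x_j \mid \theta),
\qquad \boldsymbol{w} \ge 0, \quad \|\boldsymbol{w}\|_0 \le n_k .
```

Setting every $w_j = 1$ recovers the full-data log-likelihood, while the sparsity constraint keeps at most $n_k$ points active in each client's update.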

Paper Structure

This paper contains 29 sections, 13 theorems, 79 equations, 6 figures, 3 tables.

Key Result

Theorem 1

The difference in the upper bound on the overall generalization error of $\textsc{CoreSet-PFedBayes}$ relative to that of $\textsc{PFedBayes}$ is always upper bounded by a closed-form positive function of the coreset weights and coreset size, $\boldsymbol{\Im}(\boldsymbol{w}, n_k)$.

Figures (6)

  • Figure 1: System diagram: coreset-weighted personalized federated learning model with parameters under Gaussian assumptions. Each client uploads its updated distribution to the server based on its corresponding coreset training data (each client $i$'s data $\boldsymbol{\mathcal{D}^i}$ is weighted by $\boldsymbol{w}_i$), and the aggregated global distribution from the server is then used.
  • Figure 2: Experiments on Bayesian Riemann linear function regression for coreset sizes of 220, 260, and 300, constructed by Accelerated IHT II. Coreset points are shown as black dots, with radius indicating the assigned weight. The rightmost panel shows the true posterior distribution.
  • Figure 3: KL divergence over number of epochs (MNIST dataset)
  • Figure 4: Ablation study on using the KL divergence between two local distributions versus using coreset weights alone
  • Figure 5: Communication rounds across different sample sizes (convergence analysis)
  • ...and 1 more figures

Theorems & Definitions (28)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 2
  • Definition 3
  • Lemma 1
  • proof
  • Theorem 1
  • ...and 18 more