Socially inspired Adaptive Coalition and Client Selection in Federated Learning

Alessandro Licciardi, Roberta Raineri, Anton Proskurnikov, Lamberto Rondoni, Lorenzo Zino

TL;DR

This work tackles non-IID data heterogeneity in federated learning by introducing FedCVR-Bolt, a two-stage, socially inspired client-sampling framework that first forms coalitions of similar clients via homophily-based spectral clustering and then selects one representative per coalition to maximize variance reduction of the global update. The method uses Boltzmann exploration to balance exploration and exploitation and online covariance estimation to adaptively refine cluster structure, with convergence guarantees under standard FL assumptions. Theoretical results quantify variance-reduction-based sampling and provide a convergence neighborhood bound, while experiments on synthetic and real datasets demonstrate improved accuracy and faster convergence compared to strong baselines. The approach offers a principled, scalable mechanism to mitigate heterogeneity in FL, with practical implications for more reliable and efficient distributed learning.
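The selection stage described above can be illustrated with a minimal sketch of Boltzmann (softmax) sampling, assuming each client already carries a scalar score such as an estimated variance reduction. The function names, the dictionary layout of `coalitions`, the score values, and the temperature are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random

def boltzmann_probs(scores, temperature):
    """Softmax over scores; a lower temperature concentrates mass on the best score."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def select_representatives(coalitions, scores, temperature, rng):
    """Sample one client per coalition from its Boltzmann distribution."""
    chosen = {}
    for name, members in coalitions.items():
        probs = boltzmann_probs([scores[m] for m in members], temperature)
        chosen[name] = rng.choices(members, weights=probs, k=1)[0]
    return chosen

# Hypothetical federation: five clients split into two coalitions.
coalitions = {"A": [0, 1], "B": [2, 3, 4]}
scores = [0.2, 0.9, 0.1, 0.5, 0.4]  # assumed per-client scores
rng = random.Random(0)
reps = select_representatives(coalitions, scores, temperature=0.1, rng=rng)
```

A high temperature makes the choice nearly uniform within each coalition (exploration), while a temperature close to zero almost always returns the highest-scoring client (exploitation).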

Abstract

Federated Learning (FL) enables privacy-preserving collaborative model training, but its effectiveness is often limited by client data heterogeneity. We introduce a client-selection algorithm that (i) dynamically forms nonoverlapping coalitions of clients based on asymptotic agreement and (ii) selects one representative from each coalition to minimize the variance of model updates. Our approach is inspired by social-network modeling, leveraging homophily-based proximity matrices for spectral clustering and techniques for identifying the most informative individuals to estimate a group's aggregate opinion. We provide theoretical convergence guarantees for the algorithm under mild, standard FL assumptions. Finally, we validate our approach by benchmarking it against three strong heterogeneity-aware baselines; the results show higher accuracy and faster convergence, indicating that the framework is both theoretically grounded and effective in practice.
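As a rough illustration of choosing a representative "to minimize the variance of model updates", the sketch below scores each client by how much observing its value alone reduces the variance of a weighted aggregate $\alpha^{\top}\theta$. It uses the standard best-linear-estimator identity $\mathrm{Cov}(\theta_j, \alpha^{\top}\theta)^2 / \mathrm{Var}(\theta_j)$, which is exact for jointly Gaussian variables. The covariance matrix, the weights, and the function names are assumptions made for illustration; the paper's own criterion may differ.

```python
def variance_reduction(cov, alpha, j):
    """Drop in Var(alpha^T theta) after observing theta_j (best linear estimator)."""
    cross = sum(a * c for a, c in zip(alpha, cov[j]))  # Cov(theta_j, alpha^T theta)
    return cross * cross / cov[j][j]

def best_representative(cov, alpha, members):
    """Return the coalition member with the largest variance reduction."""
    return max(members, key=lambda j: variance_reduction(cov, alpha, j))

# Hypothetical covariance: clients 0 and 1 are strongly correlated, client 2 is not.
cov = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.1],
       [0.1, 0.1, 2.0]]
alpha = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform aggregation weights

scores = [variance_reduction(cov, alpha, j) for j in range(3)]
total_variance = sum(alpha[i] * cov[i][k] * alpha[k]
                     for i in range(3) for k in range(3))
```

In this toy example the two correlated clients obtain the same score, and either one removes more of the aggregate's variance than the weakly correlated client does.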

Paper Structure

This paper contains 37 sections, 9 theorems, 42 equations, 4 tables, 1 algorithm.

Key Result

Lemma 1

Consider $\theta^d \in \mathbb{R}^K$ the vector containing all the $d$-th model components of the federation and $\theta^d_{gl} = \alpha^{\top}\theta^d$ the global model to estimate. Given $\mathcal{A} \subseteq K$, it holds [...] Here, with some abuse of notation, $\mathbb E[\theta_{gl}^d|\theta_{\mathcal

Theorems & Definitions (15)

  • Lemma 1
  • Proposition 1
  • Corollary 1
  • Proposition 2
  • Corollary 2
  • Remark A1
  • Remark A2
  • Lemma A1
  • proof
  • Proposition A1
  • ...and 5 more