One-Shot Clustering for Federated Learning

Maciej Krzysztof Zuziak, Roberto Pellungrini, Salvatore Rinzivillo

TL;DR

OCFL tackles the challenge of when to cluster clients in clustered federated learning by introducing a clustering-agnostic, one-shot scheme that activates clustering early in training based on a Clustering Temperature derived from gradient cosine similarities. The method defines a divergence matrix $\boldsymbol{\Gamma}$ and computes $T(\boldsymbol{\Gamma}) = \frac{||\boldsymbol{\Gamma}||_p}{\lambda}$ with $\lambda = (n(n-1)2^p)^{1/p}$ to signal convergence and trigger clustering. Through formal data-generating processes and experiments on MNIST, FMNIST, and CIFAR10, OCFL paired with density-based clustering (e.g., HDBSCAN, Mean-Shift) achieves high clustering quality (RAND ~ 0.95–0.98) and strong personalization while preserving generalization, outperforming several baselines. The work demonstrates practical benefits for cross-silo FL by enabling automatic, early CFL with minimal hyperparameter tuning, and it outlines future directions for privacy considerations and dynamic client environments.
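The temperature formula above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation. The summary does not spell out how $\boldsymbol{\Gamma}$ is built, so the sketch assumes $\Gamma_{ij} = 1 - \cos(g_i, g_j)$ with a zero diagonal, and it reads $||\cdot||_p$ as the entry-wise $p$-norm. Under those assumptions every off-diagonal entry lies in $[0, 2]$, so $\lambda$ is the largest value the norm can take and $T$ falls in $[0, 1]$. The function names `divergence_matrix` and `clustering_temperature` are hypothetical.

```python
import numpy as np


def divergence_matrix(grads):
    """Pairwise cosine divergence between flattened client gradients.

    Assumption (not stated in the source): Gamma[i, j] = 1 - cos(g_i, g_j),
    with the diagonal set to zero.
    """
    g = np.asarray(grads, dtype=float)
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    unit = g / np.clip(norms, 1e-12, None)  # guard against zero gradients
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    gamma = 1.0 - cos
    np.fill_diagonal(gamma, 0.0)
    return gamma


def clustering_temperature(gamma, p=2):
    """T(Gamma) = ||Gamma||_p / lambda, with lambda = (n (n - 1) 2^p)^(1/p).

    Assumption: ||.||_p is taken entry-wise over the whole matrix.
    """
    n = gamma.shape[0]
    lam = (n * (n - 1) * 2.0 ** p) ** (1.0 / p)
    norm_p = (np.abs(gamma) ** p).sum() ** (1.0 / p)
    return norm_p / lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    client_grads = rng.normal(size=(5, 100))  # 5 clients, 100 parameters each
    gamma = divergence_matrix(client_grads)
    print(clustering_temperature(gamma, p=2))
```

With these assumptions, two clients with exactly opposite gradients give $T = 1$, two clients with orthogonal gradients give $T = 0.5$ for $p = 2$, and clients with parallel gradients give $T \approx 0$. How the temperature value is turned into a clustering trigger is not described in this summary, so no threshold rule is shown.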

Abstract

Federated Learning (FL) is a widespread and well-adopted paradigm of decentralized learning that allows training one model from multiple sources without the need to directly transfer data between participating clients. Since its inception in 2015, it has been divided into numerous sub-fields that deal with application-specific issues, be it data heterogeneity or resource allocation. One such sub-field, Clustered Federated Learning (CFL), deals with the problem of clustering the population of clients into separate cohorts to deliver personalized models. Although a few remarkable works have been published in this domain, the problem is still largely unexplored, as its basic assumptions and settings are slightly different from standard FL. In this work, we present One-Shot Clustered Federated Learning (OCFL), a clustering-agnostic algorithm that can automatically detect the earliest suitable moment for clustering. Our algorithm is based on the computation of cosine similarity between the gradients of the clients and a temperature measure that detects when the federated model starts to converge. We empirically evaluate our methodology by testing various one-shot clustering algorithms on over thirty different tasks across three benchmark datasets. Our experiments showcase the good performance of our approach when used to perform CFL in an automated manner without the need to adjust hyperparameters.

Paper Structure

This paper contains 19 sections, 5 equations, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: Example of a continuous clustering assessment. Iterations are plotted on the x-axis, while the Adjusted Rand Score (RAND) is plotted on the y-axis. The simulation presents only a single run for the FMNIST task in a nonoverlapping balanced setting with 15 clients. The aggregated values for each possible combination of task, split and number of clients are presented in Table \ref{tab:experiment_EC: aggregated_clustering}.
  • Figure 2: Aggregated performance of models in terms of F1-score. In the plot, the lower end of each segment indicates the generalized F1-score (GF1), while the upper end indicates the personalized F1-score (PF1). The length of the segment represents the learning gap as defined in Equation \ref{eq: Dist}.
  • Figure 3: Behaviour of the temperature function as defined in Equation \ref{eq: Temperature}. The solid blue line shows the mean across all scenarios (overlapping and nonoverlapping, with balanced and imbalanced settings). The blue shaded band represents confidence intervals based on the variation between scenarios.