Interaction-Aware Gaussian Weighting for Clustered Federated Learning

Alessandro Licciardi, Davide Leo, Eros Fanì, Barbara Caputo, Marco Ciccone

TL;DR

The paper addresses non-IID data and class imbalance in FL by starting from the objective $\min_{\theta} \sum_{k=1}^K \frac{n_k}{n} \mathcal{L}_k(\theta)$ and pursuing cluster-specific models. It introduces FedGWC, which derives a Gaussian reward from each client’s loss trajectory to build a similarity signal, constructs an interaction matrix $P^t$ and a symmetric affinity matrix $W$ over unbiased perceptions, and applies spectral clustering to form homogeneous client clusters. A Wasserstein Adjusted Score is developed to quantify cluster cohesion under imbalance. The theoretical analysis proves that the Gaussian weights converge to the per-client rewards $\mu_k$ with variance reduction, and that the method maintains bounded interaction matrices and sampling-rate stability. Empirical results on Leaf, CIFAR-100, FEMNIST, Google Landmarks, and iNaturalist show improved clustering quality and cluster-wise accuracy, while the method remains communication- and computation-efficient and compatible with existing FL aggregators.
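The step from an interaction matrix to client clusters can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel on the rows of $P$, the parameter `gamma`, the two-way split on the Fiedler vector, and the toy matrix are all assumptions chosen for illustration, and the helper names `rbf_affinity` and `spectral_bipartition` are hypothetical.

```python
import numpy as np

def rbf_affinity(P, gamma=1.0):
    """Symmetric affinity from the rows of an interaction matrix.

    Assumption: an RBF kernel on row distances; the source only states
    that W is symmetric and built over unbiased perceptions.
    """
    sq = np.sum(P ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (P @ P.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * d2)

def spectral_bipartition(W):
    """Two-way split from the sign of the Fiedler vector of the
    symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]           # second-smallest eigenvector
    return (fiedler > 0).astype(int)

# Toy interaction matrix (hypothetical): 6 clients in two groups of 3.
rng = np.random.default_rng(0)
P = np.vstack([
    np.tile([0.9, 0.9, 0.9, 0.1, 0.1, 0.1], (3, 1)),
    np.tile([0.1, 0.1, 0.1, 0.9, 0.9, 0.9], (3, 1)),
]) + 0.01 * rng.standard_normal((6, 6))

W = rbf_affinity(P, gamma=1.0)
labels = spectral_bipartition(W)
print(labels)
```

In this toy case the first three clients receive one label and the last three the other; which group is called 0 or 1 depends on the arbitrary sign of the eigenvector.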

Abstract

Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy. However, conventional FL struggles with data heterogeneity and class imbalance, which degrade model performance. Clustered FL balances personalization and decentralized training by grouping clients with analogous data distributions, enabling improved accuracy while adhering to privacy constraints. This approach effectively mitigates the adverse impact of heterogeneity in FL. In this work, we propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution, allowing training of a more robust and personalized model on the identified clusters. FedGWC identifies homogeneous clusters by transforming individual empirical losses to model client interactions with a Gaussian reward mechanism. Additionally, we introduce the Wasserstein Adjusted Score, a new clustering metric for FL to evaluate cluster cohesion with respect to the individual class distribution. Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy, validating the efficacy of our approach.

Paper Structure

This paper contains 33 sections, 9 theorems, 45 equations, 8 figures, 9 tables, 2 algorithms.

Key Result

Theorem 5.1

Let $\{\alpha_t\}_{t = 1}^\infty$ be a sequence of positive real values, and $\{\Gamma_k^t\}_{t=1}^\infty$ the sequence of Gaussian weights. If $\{\alpha_t\}_{t = 1}^\infty \in \ell^2(\mathbb{N}) \setminus \ell^1(\mathbb{N})$, that is, the sequence is square-summable but not summable, then $\Gamma_k^t$ converges in $L^2$. Furthermore, for $t\to\infty$,
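The step-size condition can be illustrated numerically. The sketch below assumes a running-average recursion $\Gamma^{t+1} = (1-\alpha_t)\Gamma^{t} + \alpha_t\,\omega^{t}$ with noisy rewards $\omega^t$ of mean $\mu$; this recursion, the noise model, and the helper name `running_weight` are assumptions for illustration, since the update rule is not shown in this excerpt. The choice $\alpha_t = 1/t$ is square-summable but not summable, so it satisfies the hypothesis.

```python
import numpy as np

def running_weight(rewards, alphas, gamma0=0.0):
    """Assumed recursion: g <- (1 - a_t) * g + a_t * r_t."""
    g = gamma0
    for a, r in zip(alphas, rewards):
        g = (1.0 - a) * g + a * r
    return g

rng = np.random.default_rng(0)
mu, T = 0.7, 20000

# Hypothetical noisy rewards with mean mu, kept inside [0.5, 0.9].
rewards = mu + rng.uniform(-0.2, 0.2, size=T)

# alpha_t = 1/t: sum of alpha_t diverges, sum of alpha_t^2 converges.
alphas = 1.0 / np.arange(1, T + 1)

estimate = running_weight(rewards, alphas)
print(estimate, rewards.mean())
```

With $\alpha_t = 1/t$ the recursion reproduces the sample mean of the rewards seen so far, so the estimate settles near $\mu$ while its fluctuations shrink as $t$ grows.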

Figures (8)

  • Figure 1: Illustration of the Gaussian reward mechanism for two clients from Cifar100 (Dirichlet $\alpha = 0.05$, 10 sampled clients per round and $S = 8$ local iterations). The dashed line represents the average loss process $m^{t,s}$, with the blue region indicating the confidence interval $m^{t,s} \pm \sigma^{t,s}$ at fixed $t$, $s = 1,\dots, 8$. The green curve corresponds to an in-distribution client, whose loss remains within the confidence region, resulting in a high Gaussian reward. The red line represents an out-of-distribution client, whose loss lies outside the confidence region, resulting in a lower reward.
  • Figure 2: Balanced accuracy on Cifar100 for FedGWC (blue curve) with FedAvg aggregation compared to the clustered FL baselines. FedGWC detects two splits, with significant accuracy improvements when clustering is performed, and converges faster and more stably than the baseline algorithms.
  • Figure 3: Cluster evolution with respect to the recursive splits in FedGWC on Cifar100, projected onto the two-dimensional spectral embedding space. From left to right, top to bottom, FedGWC splits the clients into clusters until a certain level of intra-cluster homogeneity is reached.
  • Figure 4: Interaction matrix convergence on Cifar10 with Dirichlet parameter $\alpha = 0.05$: MSE in logarithmic scale ($y$-axis) versus communication rounds ($x$-axis).
  • Figure 5: Homogeneous (Cifar10 $\alpha = 100$) vs heterogeneous clustering (Cifar10 $\alpha = 0.05$). The interaction matrix at convergence and the corresponding scaled affinity matrix are on the left. The scatter plot in the 2D plane with spectral embedding is on the right. The algorithm perfectly separates homogeneous clients (orange) from heterogeneous clients (black).
  • ...and 3 more figures
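The reward mechanism described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formula: the reward is taken to be a Gaussian kernel of each client's loss around the round average $m^{t,s}$ with spread $\sigma^{t,s}$, averaged over the $S$ local iterations. The function name `gaussian_rewards`, the stabilizer `eps`, and the toy losses are hypothetical.

```python
import numpy as np

def gaussian_rewards(losses, eps=1e-12):
    """Per-client reward for one round.

    losses: array of shape (num_clients, S), one loss per local iteration.
    Assumed form: mean over s of exp(-(l - m)^2 / (2 * sigma^2)), where
    m and sigma are the mean and standard deviation across clients.
    """
    m = losses.mean(axis=0)       # average loss process m^{t,s}
    sigma = losses.std(axis=0)    # spread sigma^{t,s}
    z2 = (losses - m) ** 2 / (2.0 * (sigma ** 2 + eps))
    return np.exp(-z2).mean(axis=1)

# Toy round: 10 sampled clients, S = 8 local iterations.
rng = np.random.default_rng(0)
S = 8
base = np.linspace(2.0, 1.0, S)                    # shared decreasing loss
losses = base + 0.05 * rng.standard_normal((10, S))
losses[-1] += 1.5                                  # out-of-distribution client

rewards = gaussian_rewards(losses)
print(np.round(rewards, 3))
```

Clients whose losses stay inside the band around the average obtain rewards close to 1, while the shifted client falls outside the band and receives a much smaller reward, which matches the qualitative behavior the caption describes.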

Theorems & Definitions (17)

  • Theorem 5.1
  • Theorem 5.2
  • Proposition 5.3
  • Theorem 1.1
  • proof
  • Theorem 1.2
  • proof
  • Proposition 1.3
  • proof
  • Proposition 1.4
  • ...and 7 more