Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies

Yongxin Guo, Xiaoying Tang, Tao Lin

TL;DR

This work tackles heterogeneity in federated learning by proposing HCFL, a holistic clustered FL framework that unifies existing methods across four tiers (cluster formulation, weight calculation, adaptive clustering, and distance metrics) and enables seamless integration of soft and hard clustering. It formalizes the objective $L(\boldsymbol{\Theta}, \boldsymbol{\Omega})$ and extends it with a four-tier paradigm that recovers methods such as CFL, FedEM, and FedRC, while enabling cluster splitting and merging to determine the number of clusters $K$. Building on HCFL, HCFL$^{+}$ addresses four remaining challenges (consistency of intra-client weights, efficiency, adaptive support for soft clustering, and refined distance metrics) through an inconsistency- and efficiency-aware objective, soft-clustering adaptations, and a prototype-based distance design. It is validated on CIFAR10/100 and Tiny-Imagenet with architectures such as MobileNetV2 and ResNet18. The results show favorable personalization-generalization trade-offs, automatic management of the number of clusters, and improved robustness under noisy or diverse distribution shifts, highlighting the framework's practical potential for scalable, privacy-preserving FL with heterogeneous clients.

Abstract

Federated Learning (FL) is an evolving distributed machine learning approach that safeguards client privacy by keeping data on edge devices. However, the variation in data among clients poses challenges in training models that excel across all local distributions. Recent studies suggest clustering as a solution to address client heterogeneity in FL by grouping clients with distribution shifts into distinct clusters. Nonetheless, the diverse learning frameworks used in current clustered FL methods create difficulties in integrating these methods, leveraging their advantages, and making further enhancements. To this end, this paper conducts a thorough examination of existing clustered FL methods and introduces a four-tier framework, named HCFL, to encompass and extend the existing approaches. Utilizing the HCFL, we identify persistent challenges associated with current clustering methods in each tier and propose an enhanced clustering method called HCFL$^{+}$ to overcome these challenges. Through extensive numerical evaluations, we demonstrate the effectiveness of our clustering framework and the enhanced components. Our code is available at https://github.com/LINs-lab/HCFL.

Paper Structure (40 sections, 11 theorems, 125 equations, 4 figures, 13 tables, 5 algorithms)

This paper contains 40 sections, 11 theorems, 125 equations, 4 figures, 13 tables, 5 algorithms.

Key Result

Theorem A.1

Given the objective function $\mathcal{L}(\boldsymbol{\phi}, \boldsymbol{\Theta}, \boldsymbol{\Omega}, \tilde{\boldsymbol{\Omega}})$, define $\tilde{\boldsymbol{\Omega}} = \{ \tilde{\omega}_{i;k} \mid \forall i, k \}$. Then $\tilde{\boldsymbol{\Omega}}$ is obtained by [...]. The E-M steps are then obtained by maximizing $\mathcal{L}(\boldsymbol{\phi}, \boldsymbol{\Theta}, \boldsymbol{\Omega}, \tilde{\boldsymbol{\Omega}})$ ...

Figures (4)

  • Figure 1: Overview of HCFL. HCFL encompasses existing clustered FL algorithms through four tiers: (1) cluster formulations, which maximize the conditional distribution, the joint distribution, or variable relationships; (2) cluster weight calculation, using soft or hard clustering; (3) the adaptive clustering procedure, which uses a predefined number of clusters, automatically adds new clusters, or merges and removes existing clusters; and (4) client distance metrics, computed on clients' local gradients, local model parameters, or local feature norms. The four tiers together form a complete clustered FL learning process, as shown in the left part of the figure. For instance, CFL is described by A, D, G, and J, while A, E, and F cover FedEM.
  • Figure 2: Remaining challenges in clustered FL methods. We identify four key issues in clustered FL algorithms: (1) inconsistent intra-client clustering weights, (2) efficiency concerns, (3) the absence of adaptive clustering for soft clustering methods, and (4) the lack of fine-grained distance metrics for various clustering principles. The clustering principles ASCP and CSCP differ as follows: ASCP assigns clients with any shifts to different clusters, while CSCP assigns only clients with concept shifts to different clusters.
  • Figure 3: Ablation studies on Sections \ref{sec:inconsistency-aware-objective} and \ref{sec:cover-soft-cluster}. For Sec. \ref{sec:inconsistency-aware-objective}, we evaluate the test accuracy of HCFL$^{+}$ using different backbones (FedEM and FedRC) and varying values of $\tilde{\mu}$, as shown in Figures \ref{fig:fedem-test-mu} and \ref{fig:fedrc-test-mu}. For Sec. \ref{sec:cover-soft-cluster}, we present the best validation and test accuracy achieved by HCFL$^{+}$ with either FedEM or FedRC as the backbone. "w/ SCWU" indicates the use of the soft clustering weight updating mechanisms introduced in Section \ref{sec:cover-soft-cluster}. More detailed results can be found in Tables \ref{tab:ablation-studies-on-tier-1} and \ref{tab:ablation-studies-on-tier-2} in Appendix \ref{sec:Additional Experiment Results}.
  • Figure 4: Number of clusters in HCFL$^{+}$ over communication rounds. We illustrate changes in cluster numbers across communication rounds for various $\rho$ values using the CIFAR-10 dataset in our experiments.
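
The distinction between soft and hard cluster weights named in the Figure 1 caption can be illustrated with a minimal sketch. It assumes that each client has one scalar loss per cluster model and that soft weights follow a generic E-step rule (prior times exp(-loss), then normalized). The function names, the uniform default prior, and the sample losses are hypothetical and are not taken from the paper or its code.

```python
import math


def soft_weights(losses, priors=None):
    """Soft clustering: weight_k is proportional to prior_k * exp(-loss_k).

    Assumption: a generic E-step style rule, not the paper's exact update.
    """
    k = len(losses)
    if priors is None:
        priors = [1.0 / k] * k  # hypothetical uniform prior over clusters
    # Shift by the minimum loss so the exponentials stay numerically stable.
    shift = min(losses)
    scores = [p * math.exp(-(loss - shift)) for p, loss in zip(priors, losses)]
    total = sum(scores)
    return [s / total for s in scores]


def hard_weights(losses):
    """Hard clustering: all weight goes to the single lowest-loss cluster."""
    best = min(range(len(losses)), key=losses.__getitem__)
    return [1.0 if k == best else 0.0 for k in range(len(losses))]


# One client's (made-up) losses under three cluster models.
client_losses = [0.9, 0.4, 1.6]
soft = soft_weights(client_losses)
hard = hard_weights(client_losses)
print("soft:", [round(w, 3) for w in soft])
print("hard:", hard)
```

Under these assumptions, the soft weights spread a client's contribution across all cluster models, while the hard weights commit it to a single cluster.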

Theorems & Definitions (25)

  • Theorem A.1
  • proof
  • Definition B.1
  • Definition B.2: $\left\lVert\mathbf{A}\right\rVert_2$-sub-Gaussian
  • Lemma B.3
  • proof
  • Lemma B.4
  • proof
  • Lemma B.5
  • proof
  • ...and 15 more