Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies
Yongxin Guo, Xiaoying Tang, Tao Lin
TL;DR
This work tackles client heterogeneity in federated learning by proposing HCFL, a holistic clustered FL framework that unifies existing methods across four tiers: cluster formulation, weight calculation, adaptive clustering, and distance metrics. The framework formalizes a shared objective $L(\boldsymbol{\Theta}, \boldsymbol{\Omega})$, recovers methods such as CFL, FedEM, and FedRC as special cases, supports both soft and hard clustering, and uses cluster splitting and merging to determine the number of clusters $K$. Building on HCFL, HCFL$^{+}$ addresses four remaining challenges (consistency of intra-client weights, efficiency, adaptive clustering support for soft clustering, and refined distance metrics) through an inconsistency- and efficiency-aware objective, soft-clustering adaptations, and a prototype-based distance design. It is evaluated on CIFAR10, CIFAR100, and Tiny-ImageNet with architectures such as MobileNetV2 and ResNet18. The results show favorable personalization-generalization trade-offs, automatic management of the number of clusters, and improved robustness under noisy or diverse distribution shifts, which suggests the framework is practical for scalable, privacy-preserving FL with heterogeneous clients.
Abstract
Federated Learning (FL) is an evolving distributed machine learning approach that safeguards client privacy by keeping data on edge devices. However, variation in data across clients makes it difficult to train models that perform well on all local distributions. Recent studies propose clustering as a way to address client heterogeneity in FL by grouping clients with distribution shifts into distinct clusters. Nonetheless, current clustered FL methods rely on diverse learning frameworks, which makes it difficult to integrate them, combine their advantages, and improve on them. To this end, this paper conducts a thorough examination of existing clustered FL methods and introduces a four-tier framework, named HCFL, that encompasses and extends the existing approaches. Using HCFL, we identify the challenges that remain for current clustering methods in each tier and propose an enhanced clustering method, HCFL$^{+}$, to overcome them. Through extensive numerical evaluations, we demonstrate the effectiveness of our clustering framework and the enhanced components. Our code is available at https://github.com/LINs-lab/HCFL.
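
To make the distinction between hard and soft clustering concrete, the following is a minimal sketch and not the HCFL objective itself, whose exact form is not given in this summary. It assumes each client already has a scalar loss under each cluster model; the function names, the softmax weighting rule, and the `temperature` parameter are hypothetical choices made only for illustration.

```python
import math


def hard_weights(losses):
    """Hard clustering: each client puts all weight on its lowest-loss cluster."""
    weights = []
    for row in losses:
        best = min(range(len(row)), key=row.__getitem__)
        weights.append([1.0 if k == best else 0.0 for k in range(len(row))])
    return weights


def soft_weights(losses, temperature=1.0):
    """Soft clustering (assumed rule): softmax over negative losses, rows sum to 1."""
    weights = []
    for row in losses:
        low = min(row)  # shift by the row minimum for numerical stability
        exps = [math.exp(-(loss - low) / temperature) for loss in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def clustered_objective(losses, weights):
    """Mean over clients of the weight-averaged per-cluster loss."""
    total = 0.0
    for loss_row, weight_row in zip(losses, weights):
        total += sum(w * l for w, l in zip(weight_row, loss_row))
    return total / len(losses)


# Toy input: 3 clients x 2 clusters; losses[i][k] is the loss of client i
# under cluster model k (made-up numbers, not results from the paper).
losses = [[0.2, 1.5], [1.1, 0.3], [0.7, 0.6]]

hard = hard_weights(losses)
soft = soft_weights(losses)
hard_value = clustered_objective(losses, hard)
soft_value = clustered_objective(losses, soft)
```

Under this sketch, the third client has nearly equal losses on both clusters, so its soft weights are close to uniform while its hard assignment commits fully to one cluster. That is the kind of case where the choice between soft and hard clustering matters.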
