Novel clustered federated learning based on local loss
Endong Gu, Yongxin Chen, Hao Wen, Xingju Cai, Deren Han
TL;DR
This work tackles clustering in federated learning (FL) under strict privacy constraints. It addresses non-IID data by introducing LCFL, a loss-based clustering metric that does not rely on sharing gradients or raw data. The framework defines a distance between clients based on their local losses, provides theoretical bounds linking this distance to distributional discrepancy, and supports flexible clustering methods through a warm-up phase that precedes FL training. Empirical results on FEMNIST, Rotated MNIST, and Rotated CIFAR10 show that LCFL outperforms gradient- and parameter-based clustering approaches as well as standard FedAvg, especially when client data exhibit clustering structure. The approach preserves privacy, accommodates non-convex models, and offers practical improvements for personalized, distributed learning in real-world FL deployments.
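To make the idea of a loss-based client distance concrete, here is a minimal sketch. It is not the paper's definition, which this summary does not state. It assumes a hypothetical matrix `cross_loss`, where `cross_loss[i, j]` is the loss of client j's warm-up model evaluated on client i's local data, and it uses an illustrative symmetric distance (the excess loss in both directions) followed by a simple threshold grouping. The function names, the threshold `tau`, and the numbers are all assumptions made for illustration.

```python
import numpy as np


def loss_distance_matrix(cross_loss):
    """Illustrative loss-based distance (an assumption, not the paper's formula).

    cross_loss[i, j] is the loss of client j's warm-up model on client i's data.
    Only scalar losses are exchanged, never gradients or raw data.
    """
    losses = np.asarray(cross_loss, dtype=float)
    own = np.diag(losses)                      # own[i]: client i's model on its own data
    excess = np.abs(losses - own[:, None])     # excess[i, j] = |L[i, j] - L[i, i]|
    return excess + excess.T                   # symmetrize over both directions


def threshold_clusters(dist, tau):
    """Group clients into connected components of the graph with edges dist <= tau."""
    n = dist.shape[0]
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and dist[u, v] <= tau:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels


# Hypothetical losses for four clients drawn from two underlying distributions.
cross_loss = [
    [0.20, 0.25, 1.50, 1.40],
    [0.30, 0.20, 1.60, 1.50],
    [1.40, 1.50, 0.30, 0.35],
    [1.30, 1.60, 0.25, 0.20],
]
dist = loss_distance_matrix(cross_loss)
labels = threshold_clusters(dist, tau=0.5)
print(labels)
```

In this toy setting, clients whose models transfer to each other with little extra loss end up in the same group. Any standard clustering method that accepts a precomputed distance matrix could replace the threshold step.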
Abstract
This paper proposes LCFL, a novel clustering metric for evaluating clients' data distributions in federated learning. LCFL aligns with federated learning requirements and accurately assesses client-to-client variations in data distribution. It offers advantages over existing clustered federated learning methods: it addresses privacy concerns, applies more readily to non-convex models, and yields more accurate classification results. LCFL does not require prior knowledge of clients' data distributions. We provide a rigorous mathematical analysis demonstrating the correctness and feasibility of our framework. Numerical experiments with neural network models show that LCFL outperforms baselines on several clustered federated learning benchmarks.
