Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Yongxin Guo, Lin Wang, Xiaoying Tang, Tao Lin
TL;DR
Federated Learning suffers from client distribution shifts, and existing fixes mostly focus on training-time adjustments. This work introduces Client2Vec, which precomputes a per-client index $\boldsymbol{\beta}_i$ that encodes label and feature distribution shifts using CLIP embeddings $(\textbf{D}_{i,j}, \textbf{L}_{i,j})$ and the Distribution Shifts Aware Index Generation Network (DSA-IGN). The approach enables three complementary case studies (improved client sampling, model aggregation, and local training) and consistently yields significant gains across Shakespeare, CIFAR10, and DomainNet. The method decouples index generation from FL training, supports integration with existing FL methods, and offers practical runtime benefits with modest overhead, making distribution-shift-aware FL more effective in real-world settings.
Abstract
Federated Learning (FL) is a privacy-preserving distributed machine learning paradigm. Nonetheless, substantial distribution shifts among clients pose a considerable challenge to the performance of current FL algorithms. To mitigate this challenge, various methods have been proposed to enhance the FL training process. This paper tackles data heterogeneity from another perspective: improving FL algorithms before the actual training stage. Specifically, we introduce the Client2Vec mechanism, which generates a unique client index for each client before FL training begins. We then leverage the generated client index to enhance the subsequent FL training process. To demonstrate the effectiveness of Client2Vec, we conduct three case studies that assess the impact of the client index on FL training: enhanced client sampling, model aggregation, and local training. Extensive experiments on diverse datasets and model architectures show the efficacy of Client2Vec across all three case studies. Our code is available at \url{https://github.com/LINs-lab/client2vec}.
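
To illustrate why a client index computed before training can be useful, the sketch below treats each client index as a plain vector and ranks clients by cosine similarity to an anchor client. This is only a toy example under stated assumptions: the index values are made up, and the helper names (`cosine_similarity`, `most_similar_clients`) and the choice of cosine similarity are hypothetical. It is not the sampling, aggregation, or local training procedure of Client2Vec.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_clients(client_index, anchor, k):
    """Return the k client ids whose index vectors are closest to the anchor's.

    `client_index` maps a client id to its precomputed index vector.
    The anchor itself is excluded from the result.
    """
    others = [cid for cid in client_index if cid != anchor]
    others.sort(
        key=lambda cid: cosine_similarity(client_index[anchor], client_index[cid]),
        reverse=True,
    )
    return others[:k]


# Made-up index vectors standing in for precomputed client indices (assumption).
toy_index = {
    "client_a": [1.0, 0.0],
    "client_b": [0.9, 0.1],
    "client_c": [0.0, 1.0],
}

# Clients whose data distributions resemble client_a's, according to the toy index.
neighbours = most_similar_clients(toy_index, "client_a", k=1)
```

Because the index is fixed before training starts, a lookup like this can be computed once and reused in every round, which is one way to read the decoupling of index generation from FL training described above.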
