Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing

Yongxin Guo, Lin Wang, Xiaoying Tang, Tao Lin

TL;DR

Federated Learning (FL) suffers from distribution shifts across clients, and existing fixes mostly focus on training-time adjustments. This work introduces Client2Vec, which precomputes a per-client index $\boldsymbol{\beta}_i$ that encodes label and feature distribution shifts using CLIP embeddings $(\mathbf{D}_{i,j}, \mathbf{L}_{i,j})$ and the Distribution Shifts Aware Index Generation Network (DSA-IGN). The index enables three complementary case studies (improved client sampling, model aggregation, and local training) and consistently yields significant gains across Shakespeare, CIFAR10, and DomainNet. The method decouples index generation from FL training, supports integration with existing FL methods, and offers practical runtime benefits with modest overhead, making distribution-shift-aware FL more effective in real-world settings.

Abstract

Federated Learning (FL) is a privacy-preserving distributed machine learning paradigm. Nonetheless, the substantial distribution shifts among clients pose a considerable challenge to the performance of current FL algorithms. To mitigate this challenge, various methods have been proposed to enhance the FL training process. This paper tackles the issue of data heterogeneity from another perspective: improving FL algorithms prior to the actual training stage. Specifically, we introduce the Client2Vec mechanism, which generates a unique client index for each client before FL training begins. We then leverage the generated client index to enhance the subsequent FL training process. To demonstrate the effectiveness of the proposed Client2Vec method, we conduct three case studies that assess the impact of the client index on the FL training process. These case studies encompass enhanced client sampling, model aggregation, and local training. Extensive experiments conducted on diverse datasets and model architectures show the efficacy of Client2Vec across all three case studies. Our code is available at https://github.com/LINs-lab/client2vec.

Paper Structure (50 sections, 1 theorem, 12 equations, 9 figures, 7 tables)

This paper contains 50 sections, 1 theorem, 12 equations, 9 figures, 7 tables.

Key Result

Theorem A.1

Define the following objective function, where $p_{i,g}^{t}$ denotes the aggregation weights in communication round $t$, $S$ is the similarity function, and $q_i^{t}$ is a prior distribution. Solving this optimization problem, the optimal $p_{i,g}^{t}$ is given by

Figures (9)

  • Figure 1: Illustration of the DSA-IGN workflow. Each client's local data $(\mathbf{x}_{i,j}, y_{i,j})$ is encoded by the CLIP encoders into $(\mathbf{D}_{i,j}, \mathbf{L}_{i,j})$ before index generation. The CLIP image embedding $\mathbf{D}_{i,j}$ is then split into a data encoding $\mathbf{z}_{i,j}$ and a sample feature index $\mathbf{u}_{i,j}^{f}$, which are concatenated and projected to $\tilde{\mathbf{D}}_{i,j}$ to reconstruct $\mathbf{D}_{i,j}$. Finally, the client label index $\boldsymbol{\beta}_i^{l}$ and the client feature index $\boldsymbol{\beta}_i^{f}$ are obtained by averaging $\mathbf{L}_{i,j}$ and $\mathbf{u}_{i,j}^{f}$, respectively.
  • Figure 2: Illustration of feature index similarities between different domains. We analyze cosine similarities across various domains. The results are obtained using the Global training strategy.
  • Figure 3: Visualization of index similarities between clients. We illustrate the similarities of the client index $\boldsymbol{\beta}_i$ and the client feature index $\boldsymbol{\beta}_{i}^{f}$ between clients. Results are reported for both the Global and Federated training strategies. Ideally, clients in the same domain should share a similar client index, resulting in dark diagonal blocks.
  • Figure 4: Workflow of the improved local training (case study 3). The projection layer projects client features to the same dimension as the client feature index $\boldsymbol{\beta}_{i}^{f}$, and the projection classifier ensures that the projected features and the original client features contain similar information.
  • Figure 5: Ablation studies on the number of training epochs and improved model aggregation for Client2Vec. 'Original' denotes the algorithms in their original form, without enhancements, while the other results apply all three case studies with varying numbers of epochs. The model aggregation ablation panels (CIFAR and DomainNet) use client indices generated by the Global strategy.
  • ...and 4 more figures
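
The averaging step described in Figure 1 and the similarity comparison described in Figure 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random arrays stand in for the CLIP label embeddings $\mathbf{L}_{i,j}$ and the learned sample feature indices $\mathbf{u}_{i,j}^{f}$, the two toy "domains" are synthetic, and the function names `client_indices` and `cosine_similarity_matrix` are hypothetical.

```python
import numpy as np


def client_indices(label_emb, feat_idx):
    """Average per-sample vectors into a client label index and a client feature index.

    label_emb: (n_samples, d_l) array standing in for the label embeddings L_{i,j}.
    feat_idx:  (n_samples, d_f) array standing in for the sample feature indices u_{i,j}^f.
    """
    beta_l = label_emb.mean(axis=0)
    beta_f = feat_idx.mean(axis=0)
    return beta_l, beta_f


def cosine_similarity_matrix(indices):
    """Pairwise cosine similarity between the rows of `indices`."""
    norms = np.linalg.norm(indices, axis=1, keepdims=True)
    unit = indices / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


# Toy setup (an assumption for illustration): two synthetic domains, two clients each.
rng = np.random.default_rng(0)
domain_dirs = np.eye(4)[:2]  # one orthogonal direction per domain
feature_indices = []
for domain in (0, 0, 1, 1):
    feats = domain_dirs[domain] + 0.05 * rng.standard_normal((32, 4))
    labels = rng.standard_normal((32, 4))  # placeholder label embeddings
    _, beta_f = client_indices(labels, feats)
    feature_indices.append(beta_f)

# Clients from the same domain should form high-similarity diagonal blocks.
sim = cosine_similarity_matrix(np.stack(feature_indices))
```

In this toy setting, clients 0 and 1 share one domain and clients 2 and 3 share the other, so `sim` shows the block-diagonal pattern that Figure 3 describes as the ideal case.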

Theorems & Definitions (5)

  • Definition 3.1: Sample Index
  • Remark 3.2
  • Theorem A.1: Aggregation weights
  • proof
  • Definition C.1: Domain Index