Federated Learning with Sample-level Client Drift Mitigation

Haoran Xu, Jiaze Li, Wanyi Wu, Hao Ren

TL;DR

This work tackles the non-IID data challenge in Federated Learning by revealing that client drift originates from biases across individual local samples and that these biases vary dynamically during training. It introduces FedBSS, a two-stage approach with a warmup phase for diversified knowledge and a bias-aware progressive learning stage that uses loss-based sample ranking and an uncertainty-derived boundary to partition samples into unbiased and biased groups, gradually incorporating biased samples via a cosine-based schedule. Empirical results across label-skew, feature-skew, and noisy-label scenarios show that FedBSS consistently outperforms state-of-the-art baselines, with pronounced gains on larger models and more heterogeneous data, demonstrating scalability and robustness. The method offers a practical mechanism to mitigate heterogeneity in real-world FL deployments by leveraging fine-grained sample information rather than solely focusing on global-update calibration or server-side aggregation strategies.
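The progressive stage described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the half-cosine ramp, and the `boundary` argument (passed in here as a plain count, whereas the paper derives it from uncertainty) are all assumptions made for the example.

```python
import math

def cosine_fraction(epoch, total_epochs):
    """Fraction of the biased group admitted at a 0-based local epoch.

    Assumed pacing: a half-cosine ramp from 0 (first epoch) to 1 (last epoch).
    """
    if total_epochs <= 1:
        return 1.0
    progress = epoch / (total_epochs - 1)
    return 0.5 * (1.0 - math.cos(math.pi * progress))

def select_samples(losses, boundary, epoch, total_epochs):
    """Return the indices of the samples to train on in this local epoch.

    losses   -- per-sample losses under the current model
    boundary -- hypothetical input: how many lowest-loss samples count as unbiased
    """
    # Rank samples from lowest to highest loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    unbiased, biased = order[:boundary], order[boundary:]
    # Admit the lowest-loss part of the biased group first.
    n_extra = round(cosine_fraction(epoch, total_epochs) * len(biased))
    return unbiased + biased[:n_extra]

losses = [0.9, 0.1, 0.5, 0.3, 2.0, 1.2]
for epoch in range(3):
    print(epoch, select_samples(losses, boundary=2, epoch=epoch, total_epochs=3))
```

With this pacing, the first local epoch trains only on the low-loss group and the last epoch trains on every sample; other monotone schedules would fit the same skeleton.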

Abstract

Federated Learning (FL) suffers from severe performance degradation due to data heterogeneity among clients. Existing works reveal that the fundamental reason is that data heterogeneity causes client drift, where the local model update deviates from the global one, and they therefore usually tackle the problem by calibrating the obtained local update. Despite their effectiveness, existing methods lack a deep understanding of how heterogeneous data samples contribute to the formation of client drift. In this paper, we bridge this gap by identifying that the drift can be viewed as a cumulative manifestation of the biases present in all local samples, and that the bias differs from sample to sample. Moreover, the bias changes dynamically as FL training progresses. Motivated by this, we propose FedBSS, which is the first to mitigate the heterogeneity issue at the sample level and is orthogonal to existing methods. Specifically, the core idea of our method is a bias-aware sample selection scheme that dynamically selects samples in order of increasing bias, epoch by epoch, to progressively train the local model in each round. To ensure training stability, we use a diversified knowledge acquisition stage as a warm-up stage, which avoids the local optima caused by knowledge deviation early in training. Evaluation results show that FedBSS outperforms state-of-the-art baselines. In addition, we achieve effective results in feature distribution skew and noisy-label dataset settings, which shows that FedBSS not only reduces heterogeneity but is also scalable and robust.
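One way to read the "cumulative manifestation" claim is through a single gradient step. The notation below is an assumption introduced for illustration and is not taken from the paper: $w$ is the current global model, client $k$ holds samples $(x_i, y_i)$ for $i = 1, \dots, n_k$, $\ell$ is the per-sample loss, and $g$ is the gradient of the global objective at $w$.

```latex
% Local gradient on client k: the average of its per-sample gradients.
g_k = \frac{1}{n_k} \sum_{i=1}^{n_k} \nabla \ell(w; x_i, y_i)

% Drift of the local gradient from the global one, written as an
% average of per-sample deviations b_i.
g_k - g = \frac{1}{n_k} \sum_{i=1}^{n_k}
          \underbrace{\bigl( \nabla \ell(w; x_i, y_i) - g \bigr)}_{b_i}
```

Under these assumptions the drift is the mean of the per-sample terms $b_i$, so samples with small $b_i$ pull the local update away from the global direction less than samples with large $b_i$. Because each $b_i$ depends on $w$, it changes as training progresses.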

Paper Structure

This paper contains 13 sections, 6 equations, 5 figures, 2 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Illustration of FedBSS. Our approach includes local and global progressive learning, shown by green and pink arrows. The client's local learning has three steps: 1) Samples are rated by loss; a higher index $i$ in $S_i$ indicates greater loss (redder) and a lower index indicates lesser loss (bluer). 2) Loss-based sorting identifies an adaptive threshold that classifies samples into an 'unbiased' (low-loss) set and a 'biased' (higher-loss) set. 3) Training begins with the 'unbiased' samples and integrates the 'biased' ones over time, enabling comprehensive learning. Meanwhile, the global model learns the biases across rounds, shrinking the 'biased' set.
  • Figure 2: Different samples on a client induce different degrees of drift in the model.
  • Figure 3: (a) Different impacts of various local samples on model drift. (b) The relationship between loss and uncertainty changes. (c) The relationship between uncertainty and loss as local samples vary. (d) The abrupt changes in the adaptive classification points in each round when a diversified knowledge acquisition stage is absent.
  • Figure 4: Results on the DomainNet dataset
  • Figure 5: Ablation Study
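
The captions for Figures 1 and 3 refer to per-sample loss and uncertainty without defining them here. As a rough sketch, the example below assumes cross-entropy as the loss and the entropy of the softmax output as the uncertainty measure; both choices, and all function names, are assumptions for illustration and may differ from what the paper uses.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Per-sample cross-entropy loss for an integer class label."""
    return -math.log(softmax(logits)[label])

def predictive_entropy(logits):
    """Entropy of the predicted class distribution (assumed uncertainty proxy)."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)

def rank_by_loss(batch_logits, labels):
    """Return (index, loss, uncertainty) tuples sorted by increasing loss."""
    losses = [cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return [(i, losses[i], predictive_entropy(batch_logits[i])) for i in order]

# Three toy samples: confidently correct, undecided, confidently wrong.
batch_logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels = [0, 1, 0]
for index, loss, uncertainty in rank_by_loss(batch_logits, labels):
    print(index, round(loss, 3), round(uncertainty, 3))
```

In this toy batch the confidently wrong sample has the highest loss but not the highest uncertainty, which is the kind of loss-uncertainty relationship an adaptive threshold could exploit.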