Federated Learning with Sample-level Client Drift Mitigation
Haoran Xu, Jiaze Li, Wanyi Wu, Hao Ren
TL;DR
This work tackles the non-IID data challenge in Federated Learning by revealing that client drift originates from biases across individual local samples and that these biases vary dynamically during training. It introduces FedBSS, a two-stage approach with a warmup phase for diversified knowledge and a bias-aware progressive learning stage that uses loss-based sample ranking and an uncertainty-derived boundary to partition samples into unbiased and biased groups, gradually incorporating biased samples via a cosine-based schedule. Empirical results across label-skew, feature-skew, and noisy-label scenarios show that FedBSS consistently outperforms state-of-the-art baselines, with pronounced gains on larger models and more heterogeneous data, demonstrating scalability and robustness. The method offers a practical mechanism to mitigate heterogeneity in real-world FL deployments by leveraging fine-grained sample information rather than solely focusing on global-update calibration or server-side aggregation strategies.
Abstract
Federated Learning (FL) suffers from severe performance degradation caused by data heterogeneity among clients. Existing works attribute this to client drift, in which the local model update deviates from the global one, and they therefore usually tackle the problem by calibrating the local update after it has been obtained. Despite their effectiveness, existing methods lack a deep understanding of how heterogeneous data samples contribute to the formation of client drift. In this paper, we bridge this gap by identifying that the drift can be viewed as a cumulative manifestation of the biases present in all local samples, and that the bias differs from sample to sample. Moreover, the bias changes dynamically as FL training progresses. Motivated by this, we propose FedBSS, which is the first to mitigate the heterogeneity issue at the sample level and is orthogonal to existing methods. Specifically, the core idea of our method is a bias-aware sample selection scheme that dynamically selects samples in order of increasing bias, epoch by epoch, to progressively train the local model in each round. To ensure stable training, we use a diversified knowledge acquisition stage as a warm-up stage, which avoids the local optima caused by knowledge deviation early in training. Evaluation results show that FedBSS outperforms state-of-the-art baselines. In addition, FedBSS achieves effective results under feature distribution skew and noisy-label settings, which shows that it not only reduces the impact of heterogeneity but is also scalable and robust.
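The bias-aware selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that per-sample loss serves as the bias proxy, that a fixed hypothetical `base_fraction` stands in for the uncertainty-derived boundary, and that the share of samples used grows from that fraction to all samples along a half-cosine ramp over the local epochs. The function names `cosine_fraction` and `select_samples` are illustrative only.

```python
import math
import numpy as np


def cosine_fraction(epoch, num_epochs, base_fraction):
    """Fraction of local samples used at `epoch` (0-indexed).

    Rises from `base_fraction` to 1.0 along a half-cosine ramp.
    Both the ramp shape and `base_fraction` are assumptions of this sketch.
    """
    if num_epochs <= 1:
        return 1.0
    progress = epoch / (num_epochs - 1)
    ramp = 0.5 * (1.0 - math.cos(math.pi * progress))
    return base_fraction + (1.0 - base_fraction) * ramp


def select_samples(losses, epoch, num_epochs, base_fraction=0.5):
    """Return indices of the samples to train on in this local epoch.

    Samples are ranked by loss (ascending), used here as a stand-in for
    per-sample bias, and the lowest-loss share is kept.
    """
    losses = np.asarray(losses, dtype=float)
    order = np.argsort(losses, kind="stable")  # low loss first
    frac = cosine_fraction(epoch, num_epochs, base_fraction)
    k = min(len(losses), max(1, math.ceil(frac * len(losses))))
    return order[:k]


if __name__ == "__main__":
    # Hypothetical per-sample losses for one client.
    example_losses = [0.9, 0.1, 0.5, 2.0]
    for e in range(3):
        print(e, select_samples(example_losses, e, num_epochs=3).tolist())
```

In this sketch the low-loss samples are used from the first local epoch and the high-loss samples enter only toward the end, so the local model sees the full dataset by the final epoch. In practice the losses would be recomputed each epoch, since the abstract notes that sample bias changes as training progresses.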
