ISFL: Federated Learning for Non-i.i.d. Data with Local Importance Sampling
Zheqi Zhu, Yuchen Shi, Pingyi Fan, Chenghui Peng, Khaled B. Letaief
TL;DR
ISFL tackles the gradient-diversity challenge of non-i.i.d. data in federated learning by introducing local importance sampling that reweights client-side training samples to better align with the global objective. The authors derive a convergence bound that explicitly accounts for local IS and formulate per-client optimal IS strategies via a water-filling optimization, enabling practical computation of IS weights. They design ISFL algorithms that update IS weights synchronously with federated rounds and validate the approach on CIFAR-10/100, showing improved accuracy, faster convergence, and better sampling efficiency compared to non-IS baselines. The work provides theoretical guarantees for neural networks and demonstrates that local IS can be a drop-in enhancement for diverse FL frameworks, improving robustness to label-skewed non-i.i.d. data and enabling more data-efficient training.
Abstract
As a promising learning paradigm that integrates computation and communication, federated learning (FL) carries out local training and periodic sharing across distributed clients. Because the data on clients are non-i.i.d., FL models suffer from gradient diversity, degraded performance, and poor convergence. In this work, we tackle this key issue by adopting importance sampling (IS) for local training. We propose importance sampling federated learning (ISFL), an explicit framework with theoretical guarantees. First, we derive a convergence theorem for ISFL that accounts for the effects of local importance sampling. Then, we formulate the problem of selecting the optimal IS weights and obtain the theoretical solutions. We also employ a water-filling method to calculate the IS weights and develop the ISFL algorithms. Experimental results on CIFAR-10 agree well with the proposed theorems and verify that ISFL achieves better performance, sampling efficiency, and explainability on non-i.i.d. data. To the best of our knowledge, ISFL is the first solution to non-i.i.d. FL from the local sampling perspective that is theoretically compatible with neural network models. Furthermore, as a local sampling approach, ISFL can be easily migrated into other emerging FL frameworks.
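To make the idea of local importance sampling concrete, the following is a minimal sketch of generic importance sampling over a client's local dataset. It is not the ISFL algorithm itself: the abstract does not state the optimal weight formula or the water-filling procedure, so the sampling weights here are arbitrary placeholders, and the names `importance_sample` and `is_mean_estimate` are hypothetical. The sketch only illustrates the standard fact that drawing sample i with probability p_i and rescaling its contribution by 1 / (N p_i) keeps the estimate of the uniform local average unbiased.

```python
import random

def importance_sample(weights, batch_size, rng):
    """Draw indices with probability proportional to `weights`.

    Returns the sampled indices and a per-sample correction factor.
    The weights are placeholders (assumption); ISFL derives its own.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    n = len(probs)
    idx = rng.choices(range(n), weights=probs, k=batch_size)
    # Rescaling by 1 / (N * p_i) keeps the weighted batch average an
    # unbiased estimate of the uniform average over the local data.
    corr = [1.0 / (n * probs[i]) for i in idx]
    return idx, corr

def is_mean_estimate(values, weights, batch_size, rng):
    """Estimate the uniform mean of `values` from an importance-sampled batch."""
    idx, corr = importance_sample(weights, batch_size, rng)
    return sum(c * values[i] for i, c in zip(idx, corr)) / batch_size

if __name__ == "__main__":
    rng = random.Random(0)
    values = [float(i) for i in range(10)]   # stand-in for per-sample gradients
    weights = [v + 1.0 for v in values]      # hypothetical IS weights
    print(is_mean_estimate(values, weights, 50_000, rng))
```

In a federated setting, `values` would be replaced by per-sample gradients or losses computed during a client's local training, and the weights would be refreshed as training proceeds.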
