FedSAE: A Novel Self-Adaptive Federated Learning Framework in Heterogeneous Systems
Li Li, Moming Duan, Duo Liu, Yu Zhang, Ao Ren, Xianzhang Chen, Yujuan Tan, Chengliang Wang
TL;DR
FedSAE tackles systems heterogeneity in federated learning by predicting each client's affordable workload from its history of task completion and by selecting high-value clients through Active Learning. It introduces two self-adaptive algorithms, FedSAE-Ira and FedSAE-Fassa, that adjust workloads and improve participant selection while preserving privacy. Across the FEMNIST, MNIST, Sent140, and Synthetic datasets, FedSAE improves absolute test accuracy by about $26.7\%$ and reduces stragglers by roughly $90\%$ on average, and it converges faster than vanilla FedAvg. This work offers a practical path to robust FL in highly heterogeneous environments and points to future work on convergence analysis and imbalanced data handling.
Abstract
Federated Learning (FL) is a novel distributed machine learning paradigm that allows thousands of edge devices to train a model locally without uploading their data to a central server. However, because real federated settings are resource-constrained, FL suffers from systems heterogeneity, which directly produces many stragglers and indirectly leads to a significant reduction in accuracy. To address the problems caused by systems heterogeneity, we introduce FedSAE, a novel self-adaptive federated framework that automatically adjusts the training tasks of devices and actively selects participants to alleviate performance degradation. In this work, we 1) propose FedSAE, which leverages the complete history of each device's training tasks to predict the training workload that device can afford. In this way, FedSAE can estimate the reliability of each device and self-adaptively adjust the training load per client in each round. We also 2) combine our framework with Active Learning to self-adaptively select participants, which accelerates the convergence of the global model. In our framework, the server evaluates each device's training value based on its training loss and then selects the clients with higher value for the global model to reduce communication overhead. The experimental results indicate that, in a highly heterogeneous system, FedSAE converges faster than FedAvg, the vanilla FL framework. Furthermore, FedSAE outperforms FedAvg on several federated datasets: on average, FedSAE improves test accuracy by 26.7% and reduces stragglers by 90.3%.
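The two ideas in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact update or selection rules, so everything here is an assumption made for illustration and is not the FedSAE-Ira or FedSAE-Fassa algorithm: the function names, the "add one epoch after a completed task, halve after a dropped task" rule, the epoch bounds, and the loss-proportional sampling are all hypothetical.

```python
import random


def adjust_workload(current_epochs, completed, step=1, min_epochs=1, max_epochs=20):
    """Hypothetical rule: grow the workload after a completed task,
    halve it after a task the device failed to finish (a straggler)."""
    if completed:
        return min(max_epochs, current_epochs + step)
    return max(min_epochs, current_epochs // 2)


def predict_workload(history, start_epochs=2):
    """Replay a device's completion history (list of bools, oldest first)
    to estimate how many local epochs it can afford next round."""
    epochs = start_epochs
    for completed in history:
        epochs = adjust_workload(epochs, completed)
    return epochs


def select_clients(losses, k, rng):
    """Sample up to k distinct clients, favoring higher training loss.

    `losses` maps client id -> non-negative training loss. Sampling in
    proportion to loss is an assumed stand-in for the value-based
    selection described in the abstract."""
    candidates = dict(losses)
    chosen = []
    for _ in range(min(k, len(candidates))):
        ids = list(candidates)
        # A tiny epsilon keeps the total weight positive if all losses are zero.
        weights = [candidates[c] + 1e-12 for c in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del candidates[pick]
    return chosen
```

For example, under these assumed rules `predict_workload([True, True, False, True])` returns 3: the workload rises from 2 to 3 to 4, is halved to 2 after the failed round, then rises to 3. Calling `select_clients({"a": 0.1, "b": 2.0, "c": 0.5}, 2, random.Random(0))` returns two distinct client ids, with client `"b"` the most likely to be included because its loss is highest.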
