FedSAE: A Novel Self-Adaptive Federated Learning Framework in Heterogeneous Systems

Li Li, Moming Duan, Duo Liu, Yu Zhang, Ao Ren, Xianzhang Chen, Yujuan Tan, Chengliang Wang

TL;DR

FedSAE tackles systems heterogeneity in federated learning by predicting each client's affordable workload from its history of task completion and by selecting high-value clients through Active Learning. It introduces two self-adaptive algorithms, FedSAE-Ira and FedSAE-Fassa, that adjust workloads and improve participant selection while preserving privacy. Across the FEMNIST, MNIST, Sent140, and Synthetic datasets, FedSAE improves absolute test accuracy by about $26.7\%$ and reduces stragglers by roughly $90\%$ on average, and it converges faster than vanilla FedAvg. The work offers a practical path to robust FL in highly heterogeneous environments and points to future work on convergence analysis and imbalanced data handling.

Abstract

Federated Learning (FL) is a novel distributed machine learning paradigm that allows thousands of edge devices to train a model locally without uploading their data to a central server. Because real federated settings are resource-constrained, FL encounters systems heterogeneity, which directly causes many stragglers and indirectly leads to a significant reduction in accuracy. To solve the problems caused by systems heterogeneity, we introduce FedSAE, a novel self-adaptive federated framework that automatically adjusts the training task of each device and actively selects participants to alleviate the performance degradation. In this work, we 1) propose FedSAE, which leverages the complete information of devices' historical training tasks to predict the affordable training workload of each device. In this way, FedSAE can estimate the reliability of each device and self-adaptively adjust the amount of training load per client in each round. 2) We combine our framework with Active Learning to self-adaptively select participants, which accelerates the convergence of the global model. In our framework, the server evaluates the training value of each device based on its training loss, and then selects the clients with higher value for the global model to reduce communication overhead. The experimental results indicate that, in a highly heterogeneous system, FedSAE converges faster than FedAvg, the vanilla FL framework. Furthermore, FedSAE outperforms FedAvg on several federated datasets: on average, FedSAE improves test accuracy by 26.7% and reduces stragglers by 90.3%.
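
The loss-based participant selection described in the abstract can be illustrated with a minimal sketch. The abstract only states that the server values clients by their training loss and prefers clients with higher value; the softmax weighting, the temperature `beta`, and the function names below are assumptions made for illustration, not the authors' exact procedure.

```python
import math
import random


def selection_probabilities(losses, beta=1.0):
    """Map per-client training losses to selection probabilities.

    Assumption: a softmax with a hypothetical temperature `beta`;
    the source only says that higher loss means higher value.
    """
    peak = max(losses.values())  # subtract the max for numerical stability
    weights = {k: math.exp(beta * (v - peak)) for k, v in losses.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def select_clients(losses, num_selected, rng, beta=1.0):
    """Sample distinct clients, favouring those with larger training loss."""
    probs = selection_probabilities(losses, beta)
    candidates = list(probs)
    chosen = []
    for _ in range(min(num_selected, len(candidates))):
        weights = [probs[c] for c in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return chosen


if __name__ == "__main__":
    # Hypothetical losses reported by three clients.
    reported = {"client_a": 0.1, "client_b": 2.0, "client_c": 1.0}
    print(selection_probabilities(reported))
    print(select_clients(reported, 2, random.Random(0)))
```

Sampling rather than taking a strict top-k keeps low-loss clients from being excluded forever; whether FedSAE does this is not stated in the abstract.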

Paper Structure

This paper contains 14 sections, 7 equations, 8 figures, 3 tables, 3 algorithms.

Figures (8)

  • Figure 1: Results of FedAvg trained on the FEMNIST and MNIST datasets. The local epoch each client can actually afford varies, while the training epoch allocated to clients is fixed (i.e. $epoch=10,12,15,20$). As the allocated epoch increases from 10 to 20, the average test accuracy drops by up to 53.3% on MNIST and the average dropout rate rises by up to 41% on FEMNIST. Under systems heterogeneity, the performance of FedAvg therefore declines heavily, because many clients do not finish their assignments.
  • Figure 2: The workflow of FedSAE.
  • Figure 3: A simple process diagram of FedSAE-Ira. If a client drops out at round $t$, its predicted workload for round $t+1$ is halved. Otherwise, the workload keeps growing by an amount inversely proportional to its current value, e.g. the workload of client $k$ at round $t$ increases by $\frac{\mathcal{U}}{E_k^{t}}$.
  • Figure 4: A simple process diagram of FedSAE-Fassa. In each round, clients calculate their workload threshold $\theta$ from historical information. In the start stage, where a client's local epoch is less than $\theta$, the local epoch increases by $\gamma_{1}$ each round. In the arise stage, where the client's local epoch is greater than $\theta$, the local epoch increases by $\gamma_{2}$, with $\gamma_{1} > \gamma_{2}$. If a client drops out, its local epoch for the next round is halved.
  • Figure 5: Effect of different values of the inverse-ratio parameter $\mathcal{U}$. We show the test accuracy of the global model on the FEMNIST and MNIST datasets for $\mathcal{U} = 1, 2, 3, 10$. Empirically, we find that FedSAE-Ira works well when $\mathcal{U} = 10$.
  • ...and 3 more figures
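
The workload update rules summarized in the captions of Figures 3 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the floor `min_epoch`, the example values of `gamma1` and `gamma2`, the handling of an epoch exactly equal to $\theta$, and the dropout test in `simulate` (a client drops out when its assigned epoch exceeds what it can afford) are all hypothetical. Workloads are kept as floats because the captions do not say how they are rounded.

```python
def next_epoch_ira(epoch, dropped_out, u=10.0, min_epoch=1.0):
    """FedSAE-Ira style update (Figure 3 caption).

    Halve the workload after a dropout, otherwise grow it by u / epoch.
    `min_epoch` is an assumed floor, not taken from the source.
    """
    if dropped_out:
        return max(min_epoch, epoch / 2.0)
    return epoch + u / epoch


def next_epoch_fassa(epoch, dropped_out, theta,
                     gamma1=3.0, gamma2=1.0, min_epoch=1.0):
    """FedSAE-Fassa style update (Figure 4 caption).

    Below the threshold `theta` grow by gamma1 (start stage), otherwise
    grow by gamma2 (arise stage), with gamma1 > gamma2. The gamma values
    are hypothetical; epoch == theta is treated as the arise stage.
    """
    if dropped_out:
        return max(min_epoch, epoch / 2.0)
    step = gamma1 if epoch < theta else gamma2
    return epoch + step


def simulate(update, affordable, rounds, start=1.0):
    """Track one client; assume it drops out when epoch > affordable."""
    epoch, history = start, []
    for _ in range(rounds):
        dropped = epoch > affordable
        history.append((epoch, dropped))
        epoch = update(epoch, dropped)
    return history


if __name__ == "__main__":
    for epoch, dropped in simulate(next_epoch_ira, affordable=8.0, rounds=6):
        print(f"epoch={epoch:.2f} dropped={dropped}")
```

With `u=10`, a client at epoch 4 that finishes its task moves to 4 + 10/4 = 6.5, while a client that drops out at epoch 8 falls back to 4. Both rules probe upward cautiously and back off quickly, which matches the halving behaviour described in the captions.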