
Exploring the Practicality of Federated Learning: A Survey Towards the Communication Perspective

Khiem Le, Nhan Luong-Ha, Manh Nguyen-Duc, Danh Le-Phuoc, Cuong Do, Kok-Seng Wong

TL;DR

This survey tackles the practical challenge of communication efficiency in Federated Learning by defining quantitative measures of communication overhead, identifying its primary sources, and presenting a comprehensive taxonomy of methods to reduce rounds, client participation, and network burdens. It reviews both centralized and decentralized FL frameworks, assesses open-source programming environments, and surveys an extensive set of techniques—including regularization, aggregation correction, one-shot updating, dynamic client selection, quantization, sparsification, factorization, and distillation. The authors also discuss FL architectures (hierarchical and peer-to-peer), and articulate future directions such as hybrid communication strategies, dynamic participation, transfer learning, and privacy-preserving mechanisms, all aimed at enabling scalable, privacy-preserving FL in real-world deployments. The work synthesizes concrete metrics, methodological trends, and architectural considerations to advance the practicality and adoption of FL in diverse domains like IoT, healthcare, and finance, highlighting trade-offs between model performance, privacy, and communication cost. Overall, the paper provides a structured roadmap for designing, evaluating, and deploying communication-efficient FL systems at scale.

Abstract

Federated Learning (FL) is a promising paradigm that offers significant advancements in privacy-preserving, decentralized machine learning by enabling collaborative training of models across distributed devices without centralizing data. However, the practical deployment of FL systems faces a significant bottleneck: the communication overhead caused by frequently exchanging large model updates between numerous devices and a central server. This communication inefficiency can hinder training speed, model performance, and the overall feasibility of real-world FL applications. In this survey, we investigate various strategies and advancements made in communication-efficient FL, highlighting their impact and potential to overcome the communication challenges inherent in FL systems. Specifically, we define measures for communication efficiency, analyze sources of communication inefficiency in FL systems, and provide a taxonomy and comprehensive review of state-of-the-art communication-efficient FL methods. Additionally, we discuss promising future research directions for enhancing the communication efficiency of FL systems. By addressing the communication bottleneck, FL can be effectively applied and enable scalable and practical deployment across diverse applications that require privacy-preserving, decentralized machine learning, such as IoT, healthcare, or finance.

Paper Structure (37 sections, 3 equations, 8 figures, 6 tables, 2 algorithms)

This paper contains 37 sections, 3 equations, 8 figures, 6 tables, 2 algorithms.

Figures (8)

  • Figure 1.1: (a) A standard FL system workflow: each client device trains a model on its private data and uploads the trained weights to a central server, where the weights from all clients are aggregated and sent back to the clients. (b) Number of articles containing "Federated Learning" & "Machine Learning" on Google Scholar by year.
  • Figure 1.2: Overview of the survey structure.
  • Figure 2.1: Overview of IoT deep learning paradigms.
  • Figure 2.2: An illustration of two distinct FL implementations with 4 participating clients and 3 updates. The naive implementation uses 3 rounds with 1 update each, while FedAvg uses 1 round with 3 local updates.
  • Figure 4.1: A taxonomy of communication-efficient FL methods covered in this survey. First, methods that reduce the number of communication rounds are discussed in the section on the number of communication rounds. Second, methods that reduce the number of participating clients (client selection) are discussed in the section on the number of clients. Last, methods that reduce the burden on the network via model compression are discussed in the section on burdens on the network.
  • ...and 3 more figures
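
The contrast drawn in the Figure 2.2 caption (3 rounds with 1 update each versus 1 round with 3 local updates, for 4 clients) can be made concrete with a small counting sketch. This is only an illustration under stated assumptions, not the survey's own metric: it assumes every client participates in every round, and that each round costs one model download and one model upload per client. The function name `count_transfers` and its parameters are hypothetical.

```python
def count_transfers(num_clients: int, total_updates: int, local_updates_per_round: int) -> dict:
    """Count rounds and model transfers for a fixed budget of local updates.

    Assumptions (not from the survey): full client participation, and one
    download plus one upload per client in every communication round.
    """
    if local_updates_per_round <= 0:
        raise ValueError("local_updates_per_round must be positive")
    if total_updates % local_updates_per_round != 0:
        raise ValueError("total_updates must be a multiple of local_updates_per_round")

    rounds = total_updates // local_updates_per_round
    uploads = rounds * num_clients    # client -> server, once per client per round
    downloads = rounds * num_clients  # server -> client, once per client per round
    return {"rounds": rounds, "uploads": uploads, "downloads": downloads}


# Setting from the Figure 2.2 caption: 4 clients, 3 updates in total.
naive = count_transfers(num_clients=4, total_updates=3, local_updates_per_round=1)
fedavg = count_transfers(num_clients=4, total_updates=3, local_updates_per_round=3)

print("naive :", naive)   # 3 rounds, 12 uploads, 12 downloads
print("FedAvg:", fedavg)  # 1 round, 4 uploads, 4 downloads
```

Under these assumptions, batching the three updates into one round cuts the number of model transfers by a factor of three. The sketch says nothing about the resulting model accuracy, which this count does not capture.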