
Decentralized Federated Learning: A Survey and Perspective

Liangqi Yuan, Ziran Wang, Lichao Sun, Philip S. Yu, Christopher G. Brinton

TL;DR

This paper addresses the gap in a systematic, design-oriented understanding of decentralized federated learning (DFL) by introducing a structured taxonomy along five axes: iteration order, communication protocol, network topology, paradigm, and temporal variability. It defines two learning paradigms—Continual and Aggregate—and proposes topology-based variants (Line, Ring, Mesh, Star, Hybrid) with real-world deployment scenarios and challenges. By contrasting CFL with DFL, surveying real-world applications, and analyzing security, incentives, and management issues, the work provides concrete guidance for building scalable, private, and robust serverless learning systems. The contributions offer a practical roadmap for designing DFL architectures that balance personalization, generalization, communication efficiency, and resilience, thereby catalyzing future research and real-world adoption.

Abstract

Federated learning (FL) has been gaining attention for its ability to share knowledge while keeping user data local, protecting privacy, increasing learning efficiency, and reducing communication overhead. Decentralized FL (DFL) is a network architecture that, in contrast to centralized FL (CFL), eliminates the need for a central server. DFL enables direct communication between clients, resulting in significant savings in communication resources. This paper provides a comprehensive survey of, and perspective on, DFL. First, it reviews the methodology, challenges, and variants of CFL to lay out the background of DFL. Then, it introduces a systematic and detailed perspective on DFL, covering iteration order, communication protocols, network topologies, paradigm proposals, and temporal variability. Next, based on the definition of DFL, it proposes several extended variants and categorizations with state-of-the-art (SOTA) technologies. Lastly, in addition to summarizing the current challenges in DFL, it discusses some possible solutions and future research directions.

Paper Structure (25 sections, 6 figures, 4 tables, 2 algorithms)

Figures (6)

  • Figure 1: Comparative analysis between centralized FL and decentralized FL across various performance metrics. Each axis represents a metric with the plotted values indicating the relative strength of the respective FL approach in that domain.
  • Figure 2: Illustration of local learning, centralized learning, CFL, and DFL. (a) Clients are trained with local user data only. The clients neither share raw data nor communicate with each other. (b) After clients send the user data packets to the server, the server trains a general model using all the data. The generalized model is then shared with all clients. (c) Clients send the locally trained model parameters to the server. The server aggregates all the local models and then transmits the aggregated global model parameters to all the clients. (d) Clients share their locally trained model with other clients. Subsequent clients then continue to learn, personalize, and adapt the model locally, while also exchanging and propagating the model parameters that possess local knowledge.
  • Figure 3: Roadmap for this perspective paper.
  • Figure 4: Illustration of communication network topology.
  • Figure 5: Illustration of the two paradigms, Continual and Aggregate, for sequential pointing line DFL in the parameter space, showcasing their respective learning and communication processes. The length of the arrow represents both the learning difficulty and the magnitude of the model parameters that undergo changes during learning, which can be measured using the $\ell_2$ norm. Shorter arrows are desired as they indicate more accessible, stable, and accurate model learning and convergence. Excessively long arrows suggest that the given loss function and learning rate may not produce the desired model outcome.
  • ...and 1 more figures
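
The captions of Figures 2 and 5 describe two communication patterns and a way to measure how far a model moves during learning. The sketch below is an illustrative toy, not the paper's algorithms: it assumes each client's local training is a single gradient step on a hypothetical quadratic loss 0.5 * ||w - target||^2, and all function names (`local_train`, `cfl_round`, `dfl_line_pass`) are made up for this example. It contrasts server-side averaging, as in Figure 2(c), with passing the model from client to client along a line, as in Figure 2(d), and records the $\ell_2$ length of each local update, which is the quantity Figure 5 draws as arrow length.

```python
import math


def local_train(w, target, lr=0.5, steps=1):
    """Toy local training: gradient steps on 0.5 * ||w - target||^2 (assumed loss)."""
    for _ in range(steps):
        w = [wi - lr * (wi - ti) for wi, ti in zip(w, target)]
    return w


def l2_distance(a, b):
    """l2 norm of the parameter change between two models."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cfl_round(global_w, targets):
    """Centralized pattern: every client trains from the global model,
    then a server averages the local models."""
    local_models = [local_train(global_w, t) for t in targets]
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]


def dfl_line_pass(w, targets):
    """Decentralized line pattern: each client trains on the model it
    receives and forwards the result to the next client (no server)."""
    step_lengths = []
    for t in targets:
        new_w = local_train(w, t)
        step_lengths.append(l2_distance(new_w, w))
        w = new_w
    return w, step_lengths


# Three hypothetical clients whose local optima differ (non-IID data).
targets = [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]]
start = [0.0, 0.0]

cfl_model = cfl_round(start, targets)
dfl_model, step_lengths = dfl_line_pass(start, targets)

print("CFL model after one round:", cfl_model)
print("DFL model after one line pass:", dfl_model)
print("l2 length of each DFL update:", step_lengths)
```

In this toy setting the sequentially passed model ends up closer to the last client's optimum, while the averaged model sits between all three. This only illustrates the two communication patterns; it says nothing about how they compare on real data.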