Topology Learning for Heterogeneous Decentralized Federated Learning over Unreliable D2D Networks

Zheshun Wu; Zenglin Xu; Dun Zeng; Junfan Li; Jie Liu

TL;DR

This work addresses decentralized federated learning (DFL) over unreliable UDP-based device-to-device (D2D) networks with heterogeneous data distributions. It derives a convergence bound that introduces the unreliable-links-aware neighborhood discrepancy $\bar{H}$ and proposes ToLRDUL, a topology-learning method that minimizes $\bar{H}$ by jointly considering representation discrepancy and link outages via a Frank-Wolfe optimization over a sparse set of topologies. The approach uses Gaussian representations to approximate gradient discrepancies and exchanges compact encrypted statistics every $K$ rounds to reduce communication. Empirical results on Dirichlet and Rotated CIFAR-10 demonstrate faster convergence and higher test accuracy with ToLRDUL, while achieving lower latency than baselines, validating the theoretical claims and practical impact for robust DFL in unreliable D2D environments.
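The Frank-Wolfe step mentioned above can be illustrated with a minimal sketch. This is not the authors' ToLRDUL implementation: the objective, the function name `frank_wolfe_weights`, and the fallback rule for failed links are all assumptions made for illustration. One client chooses mixing weights over its neighbors (a point on the probability simplex) so that the expected aggregate of the neighbors' vectors is close to a target vector, where each link succeeds with a given probability.

```python
def frank_wolfe_weights(neighbor_vecs, self_vec, target, link_success, iters=200):
    """Frank-Wolfe over the probability simplex for a hypothetical objective.

    Minimizes || sum_j w_j * v_j - target ||^2, where v_j is the expected
    vector received over link j. All names here are illustrative.
    """
    n, dim = len(neighbor_vecs), len(target)
    # Assumption: with probability p_j the neighbor's vector arrives,
    # otherwise the client falls back on its own vector.
    eff = [
        [p * g[k] + (1.0 - p) * self_vec[k] for k in range(dim)]
        for g, p in zip(neighbor_vecs, link_success)
    ]

    def residual(w):
        return [sum(w[j] * eff[j][k] for j in range(n)) - target[k]
                for k in range(dim)]

    w = [1.0 / n] * n  # start from uniform weights
    for t in range(iters):
        r = residual(w)
        # Gradient of the squared norm with respect to each weight.
        grad = [2.0 * sum(eff[j][k] * r[k] for k in range(dim)) for j in range(n)]
        # Linear minimization oracle on the simplex: the best single vertex.
        best = min(range(n), key=lambda j: grad[j])
        gamma = 2.0 / (t + 2.0)  # standard diminishing Frank-Wolfe step
        w = [(1.0 - gamma) * wj for wj in w]
        w[best] += gamma
    objective = sum(rk * rk for rk in residual(w))
    return w, objective


weights, obj = frank_wolfe_weights(
    neighbor_vecs=[[1.0, 0.0], [-1.0, 0.0], [0.0, 5.0]],
    self_vec=[0.0, 0.0],
    target=[0.0, 0.0],
    link_success=[1.0, 0.5, 1.0],
)
print(weights, obj)
```

Because each iteration moves toward a single vertex of the simplex, the iterate after a few steps has few non-zero entries, which is one reason Frank-Wolfe is a natural fit when sparse topologies are wanted.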

Abstract

With the proliferation of intelligent mobile devices in wireless device-to-device (D2D) networks, decentralized federated learning (DFL) has attracted significant interest. Compared to centralized federated learning (CFL), DFL mitigates the risk of central server failures due to communication bottlenecks. However, DFL faces several challenges, such as the severe heterogeneity of data distributions in diverse environments, and the transmission outages and packet errors caused by the adoption of the User Datagram Protocol (UDP) in D2D networks. These challenges often degrade the convergence of training DFL models. To address them, we conduct a thorough theoretical convergence analysis for DFL and derive a convergence bound. By defining a novel quantity named unreliable links-aware neighborhood discrepancy in this convergence bound, we formulate a tractable optimization objective, and develop a novel Topology Learning method considering the Representation Discrepancy and Unreliable Links in DFL, named ToLRDUL. Extensive experiments under both feature skew and label skew settings have validated the effectiveness of our proposed method, demonstrating improved convergence speed and test accuracy, consistent with our theoretical findings.
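To make the effect of unreliable links concrete, the following toy simulation averages scalar "models" over a four-client ring while each transmission is dropped independently. It is a sketch under stated assumptions, not the paper's system model: the names `gossip_round`, `run_gossip`, and `RING4`, the i.i.d. drop probability, and the rule that a client substitutes its own value for a lost packet are all hypothetical.

```python
import random


def gossip_round(values, mixing, success_prob, rng):
    """One averaging round; a lost packet is replaced by the receiver's own value."""
    n = len(values)
    updated = []
    for i in range(n):
        acc = 0.0
        for j in range(n):
            if mixing[i][j] == 0.0:
                continue  # i and j are not connected
            # A client always "receives" its own value; other links may fail.
            arrived = (j == i) or (rng.random() < success_prob)
            acc += mixing[i][j] * (values[j] if arrived else values[i])
        updated.append(acc)
    return updated


def run_gossip(values, mixing, success_prob, rounds, seed=0):
    rng = random.Random(seed)
    for _ in range(rounds):
        values = gossip_round(values, mixing, success_prob, rng)
    return values


THIRD = 1.0 / 3.0
# Ring 0-1-2-3-0 with equal weight on self and both neighbors (illustrative).
RING4 = [
    [THIRD, THIRD, 0.0, THIRD],
    [THIRD, THIRD, THIRD, 0.0],
    [0.0, THIRD, THIRD, THIRD],
    [THIRD, 0.0, THIRD, THIRD],
]

print(run_gossip([0.0, 1.0, 2.0, 3.0], RING4, success_prob=1.0, rounds=50))
print(run_gossip([0.0, 1.0, 2.0, 3.0], RING4, success_prob=0.7, rounds=50, seed=1))
```

With fully reliable links the clients agree on the initial average. With drops, every update is still a convex combination of current values, but the effective mixing matrix in a given round is no longer doubly stochastic, so the value the clients settle near need not be the true average.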

Paper Structure (10 sections, 2 theorems, 16 equations, 2 figures, 2 tables, 1 algorithm)

Introduction
System Model
Decentralized optimization over unreliable D2D networks
Transmission model of unreliable D2D networks
Convergence Analysis
Topology Learning
Numerical Results
Conclusion
Proof of Theorem 1
Proof of Theorem 2

Key Result

Theorem 1

Let the smoothness, variance, and neighborhood-boundedness assumptions hold. With the stepsize chosen as $\eta_t=\frac{\eta}{\sqrt{T}} \leq \frac{1}{\beta}$, the convergence bound of Theorem 1 holds, where $\eta$ is a constant and $f^*$ denotes the minimal value of $f$.
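The stepsize condition can be checked numerically. The helper names below are hypothetical, and $\beta$ is assumed here to be the smoothness constant; the sketch only encodes the inequality $\frac{\eta}{\sqrt{T}} \leq \frac{1}{\beta}$, which is equivalent to $\eta \leq \frac{\sqrt{T}}{\beta}$.

```python
import math


def constant_stepsize(eta, num_rounds, beta):
    """Return eta / sqrt(T), raising if it exceeds 1 / beta."""
    if num_rounds <= 0 or beta <= 0:
        raise ValueError("num_rounds and beta must be positive")
    step = eta / math.sqrt(num_rounds)
    if step > 1.0 / beta:
        raise ValueError("stepsize violates eta / sqrt(T) <= 1 / beta")
    return step


def largest_admissible_eta(num_rounds, beta):
    """Largest eta for which eta / sqrt(T) <= 1 / beta still holds."""
    return math.sqrt(num_rounds) / beta


# Example: T = 10_000 rounds and beta = 10 allow any eta up to 10.
print(largest_admissible_eta(10_000, 10.0))
print(constant_stepsize(5.0, 10_000, 10.0))
```

Note that for a fixed $\eta$ the condition becomes easier to satisfy as $T$ grows, since the per-round stepsize shrinks like $1/\sqrt{T}$.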

Figures (2)

Figure 1: An illustration of DFL systems deployed in wireless D2D networks. In this system, the sensors serve as the DFL clients. The UDP protocol is used in D2D communication and the D2D links are unreliable.
Figure 2: Convergence analysis of DFL with ToLRDUL and other baselines on label-skew and feature-skew CIFAR-10.

Theorems & Definitions (7)

Remark 1
Theorem 1
Remark 2
Theorem 2
Remark 3
Proof 1
Proof 2
