Adaptive Decentralized Federated Learning for Robust Optimization

Shuyuan Wu, Feifei Wang, Yuan Gao, Rui Wang, Hansheng Wang

TL;DR

The paper tackles robustness in decentralized federated learning (DFL) by addressing the detrimental impact of abnormal clients. It introduces adaptive decentralized federated learning (aDFL), which assigns per-client learning-rate weights based on gradient behavior to down-weight suspicious updates, without requiring prior knowledge of neighbors. The authors prove convergence results, including an oracle-property guarantee, and demonstrate superior robustness and efficiency through extensive synthetic and real-data experiments. The method preserves the original network topology and is applicable to heterogeneous data, with promising directions for privacy, communication efficiency, and broader robustness extensions.

Abstract

In decentralized federated learning (DFL), the presence of abnormal clients, often caused by noisy or poisoned data, can significantly disrupt the learning process and degrade the overall robustness of the model. Previous methods on this issue often require a sufficiently large number of normal neighboring clients or prior knowledge of reliable clients, which reduces the practical applicability of DFL. To address these limitations, we develop here a novel adaptive DFL (aDFL) approach for robust estimation. The key idea is to adaptively adjust the learning rates of clients. By assigning smaller rates to suspicious clients and larger rates to normal clients, aDFL mitigates the negative impact of abnormal clients on the global model in a fully adaptive way. Our theory does not put any stringent conditions on neighboring nodes and requires no prior knowledge. A rigorous convergence analysis is provided to guarantee the oracle property of aDFL. Extensive numerical experiments demonstrate the superior performance of the aDFL method.
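The key idea in the abstract, shrinking the learning rate of suspicious clients while keeping it large for normal ones, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `adaptive_rates` and `adfl_step`, the median-based weighting rule, and the use of a global median (which a truly decentralized client would not have access to) are hypothetical and are not taken from the paper, whose actual weighting scheme is not given in this summary.

```python
import numpy as np


def adaptive_rates(grad_norms, base_rate=0.1):
    """Hypothetical weighting rule (not the paper's): clients whose gradient
    norm is large relative to the median get a smaller learning rate."""
    grad_norms = np.asarray(grad_norms, dtype=float)
    scale = np.median(grad_norms) + 1e-12          # avoid division by zero
    weights = 1.0 / (1.0 + (grad_norms / scale) ** 2)
    return base_rate * weights / weights.max()     # largest rate equals base_rate


def adfl_step(thetas, W, grads, rates):
    """One decentralized update for all clients at once.

    thetas: (M, p) array, one parameter vector per client
    W:      (M, M) row-stochastic matrix; row m averages client m's neighbors
    grads:  (M, p) array of local gradients
    rates:  (M,) per-client learning rates
    """
    return W @ thetas - rates[:, None] * grads


# Illustrative gradient norms: the last client looks suspicious.
grad_norms = [1.0, 1.2, 0.9, 15.0]
rates = adaptive_rates(grad_norms)
print(rates)
```

In this sketch a client with an unusually large gradient norm still participates in neighbor averaging through `W`, so the network topology is unchanged; only the size of its own gradient step is reduced.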

Paper Structure

This paper contains 14 sections, 5 theorems, 11 equations, 2 figures, 2 algorithms.

Key Result

Theorem 0

Assume Conditions ass:ps -- ass:bias hold. Further assume that $\varrho < \epsilon$ for some sufficiently small but fixed $\epsilon$ depending on $(L_{\max}, \lambda_{\min}, \rho)$. Then we have ${\mathbb{E}}\|\widehat{\theta} - \theta_0\|^2 = V(\widehat{\theta}) + \|\overline{\flat}_{\mathcal{A}}\|\cdots$ Here $\overline{\flat}_{\mathcal{A}} = |\mathcal{A}|^{-1} \sum_{m \in \mathcal{A}} \flat_m$. The de…

Figures (2)

  • Figure 1: The logarithm of MSE values versus the fraction of abnormal clients ($\varrho$) under the Directed Circle Network and the homogeneous scenario. Different algorithms are evaluated under different corruption types and two in-degrees ($D$).
  • Figure 2: The testing accuracy over iterations for CIFAR10 in the heterogeneous scenario. Different methods are evaluated with varying link probabilities ($q$) and the fraction of abnormal clients ($\varrho$) under the LF corruption and Erdős–Rényi Graph.
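The two network structures named in the figure captions can be sketched as follows. This is an illustration under stated assumptions, not the paper's construction: the helper names `directed_circle`, `erdos_renyi`, and `row_normalize` are hypothetical, the directed circle is assumed to connect each client to the next $D$ clients around the ring, and the Erdős–Rényi graph is assumed to draw each directed link independently with probability $q$.

```python
import numpy as np


def directed_circle(n_clients, in_degree):
    """Adjacency matrix where client m receives from the next `in_degree`
    clients on a ring (assumed layout; requires in_degree < n_clients)."""
    A = np.zeros((n_clients, n_clients))
    for m in range(n_clients):
        for d in range(1, in_degree + 1):
            A[m, (m + d) % n_clients] = 1.0
    return A


def erdos_renyi(n_clients, q, rng):
    """Directed Erdős–Rényi adjacency: each off-diagonal link appears
    independently with probability q."""
    A = (rng.random((n_clients, n_clients)) < q).astype(float)
    np.fill_diagonal(A, 0.0)
    return A


def row_normalize(A):
    """Turn an adjacency matrix into a row-stochastic weighting matrix.
    A client with no in-neighbors keeps its own parameters (weight 1 on itself)."""
    deg = A.sum(axis=1, keepdims=True)
    W = A / np.maximum(deg, 1.0)
    isolated = np.flatnonzero(deg.ravel() == 0)
    W[isolated, isolated] = 1.0
    return W


rng = np.random.default_rng(0)
W_circle = row_normalize(directed_circle(n_clients=10, in_degree=2))
W_er = row_normalize(erdos_renyi(n_clients=10, q=0.3, rng=rng))
```

Row normalization is one common way to obtain averaging weights from a graph; whether the paper uses exactly this normalization is not stated in this summary.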

Theorems & Definitions (5)

  • Theorem 0: MSE of $\widehat{\theta}$
  • Proposition 1: Convergence Property of the Standard DFL
  • Theorem 1: Convergence property of $\widehat{\theta}^{*(t)}_{\mathcal{A}}$
  • Theorem 2: Convergence rate of the aDFL
  • Corollary 1: General initial estimator