RESIST: Resilient Decentralized Learning Using Consensus Gradient Descent

Cheng Fang, Rishabh Dixit, Waheed U. Bajwa, Mert Gurbuzbalaban

TL;DR

RESIST tackles resilient decentralized learning under dynamic MITM attacks by combining consensus with coordinate-wise trimmed-mean filtering, enabling robust learning without a central server. It delivers geometric convergence for strongly convex and PL losses and sublinear rates for smooth nonconvex objectives, accompanied by statistical learning-rate guarantees that scale with the number of nodes. The approach unifies MITM and Byzantine defenses, achieving exact convergence when local losses align and maintaining strong performance under various attack and data-distribution regimes. Empirical results on MNIST and CIFAR-10 demonstrate RESIST’s resilience and scalability in adversarial network environments, highlighting its practical relevance for privacy-preserving, distributed ML systems.
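The combination of consensus with coordinate-wise trimmed-mean filtering can be illustrated with a minimal sketch. This is not the paper's algorithm verbatim: the function names `trimmed_mean_mix` and `local_step`, the trimming parameter `b`, and the choice to average the surviving values together with the node's own state are assumptions made for illustration.

```python
import numpy as np


def trimmed_mean_mix(own, received, b):
    """Coordinate-wise trimmed mean (illustrative sketch, assumed form).

    For every coordinate, drop the b largest and b smallest values received
    from neighbors, then average the survivors together with the node's own
    value. `b` is a hypothetical bound on the number of corrupted incoming
    links per node.
    """
    own = np.asarray(own, dtype=float)
    received = np.asarray(received, dtype=float)  # shape: (num_neighbors, d)
    n = received.shape[0]
    if n <= 2 * b:
        raise ValueError("need more than 2*b received vectors to trim")
    # Sort each coordinate independently, then discard the extremes.
    kept = np.sort(received, axis=0)[b:n - b]
    return (kept.sum(axis=0) + own) / (kept.shape[0] + 1)


def local_step(own, received, grad, b, step_size):
    """One filtered-consensus step followed by a local gradient step."""
    mixed = trimmed_mean_mix(own, received, b)
    return mixed - step_size * np.asarray(grad, dtype=float)


# Example: the last received vector has been altered on a compromised link.
own_state = np.array([0.0, 0.0])
incoming = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [100.0, -100.0]])
mixed_state = trimmed_mean_mix(own_state, incoming, b=1)
next_state = local_step(own_state, incoming, grad=[1.0, 1.0], b=1, step_size=0.1)
```

Because trimming is applied per coordinate, the altered vector is discarded in both coordinates here even though it is extreme in opposite directions, and the mixed state stays within the range of the unaltered values.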

Abstract

Empirical risk minimization (ERM) is a cornerstone of modern machine learning (ML), supported by advances in optimization theory that ensure efficient solutions with provable algorithmic convergence rates, which measure the speed at which optimization algorithms approach a solution, and statistical learning rates, which characterize how well the solution generalizes to unseen data. Privacy, memory, computational, and communications constraints increasingly necessitate data collection, processing, and storage across network-connected devices. In many applications, these networks operate in decentralized settings where a central server cannot be assumed, requiring decentralized ML algorithms that are both efficient and resilient. Decentralized learning, however, faces significant challenges, including an increased attack surface for adversarial interference during decentralized learning processes. This paper focuses on the man-in-the-middle (MITM) attack, which can cause models to deviate significantly from their intended ERM solutions. To address this challenge, we propose RESIST (Resilient dEcentralized learning using conSensus gradIent deScenT), an optimization algorithm designed to be robust against adversarially compromised communication links. RESIST achieves algorithmic and statistical convergence for strongly convex, Polyak-Lojasiewicz, and nonconvex ERM problems. Experimental results demonstrate the robustness and scalability of RESIST for real-world decentralized learning in adversarial environments.
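For reference, the two function classes named in the abstract are usually defined as follows. The notation is assumed here rather than taken from the paper: $f$ is the objective, $f^\star$ its minimum value, and $\mu > 0$ a constant.

$$
\text{Strong convexity:}\quad f(\mathbf{w}') \ge f(\mathbf{w}) + \langle \nabla f(\mathbf{w}),\, \mathbf{w}' - \mathbf{w} \rangle + \frac{\mu}{2}\,\|\mathbf{w}' - \mathbf{w}\|^2 \quad \text{for all } \mathbf{w}, \mathbf{w}'
$$

$$
\text{Polyak-Lojasiewicz (PL):}\quad \frac{1}{2}\,\|\nabla f(\mathbf{w})\|^2 \ge \mu\,\bigl(f(\mathbf{w}) - f^\star\bigr) \quad \text{for all } \mathbf{w}
$$

Strong convexity implies the PL inequality with the same $\mu$, while a PL function need not be convex. The precise assumptions and constants used in the paper may differ from these textbook forms.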

Paper Structure

This paper contains 77 sections, 36 theorems, 252 equations, 14 figures, 1 table, 2 algorithms.

Key Result

Lemma 3.4

Let $\mathbf{W}(t) \in \mathbb{R}^{M \times d}$ be the state matrix whose $i$-th row corresponds to the transpose of the state vector $\mathbf{w}_i(t) \in \mathbb{R}^d$ at node $i$, as given in Algorithm gradient descent algorithm. Under Assumption claim2, the mixing step (Step RESIST: CWTM) in Algo where the entries of $\mathbf{Y}_k(t)$, the mixing matrix with zero entries corresponding to compro

Figures (14)

  • Figure 1: Illustrations of different system architectures and adversarial attack models: (a) A distributed system with centralized coordination, where a central server manages the training process. (b) A decentralized system, where nodes collaborate without central coordination. (c) A decentralized system under a Byzantine attack, where one of the five nodes is compromised (colored red) and sends arbitrary or corrupted values to its neighbors through red-colored links. (d) A decentralized system under a man-in-the-middle (MITM) attack, where two communication links are under attack (colored red), allowing the attacker to alter the transmitted information before it is received, even though no nodes are compromised. These attacked links can change over time, making the communication vulnerabilities dynamic. A discussion of the mathematical mapping of the Byzantine attack problem to the MITM attack problem is provided in Sec. \ref{mapping}.
  • Figure 2: Performance comparison of RESIST for different choices of the parameter $J$, with the graph and the attack held fixed.
  • Figure 3: Comparison of RESIST and DGD under different choices of compromised links in the network.
  • Figure 4: Comparison of RESIST on networks of different sizes.
  • Figure 5: Comparison of RESIST, RESIST-M, K, and B with two and four compromised links.
  • ...and 9 more figures

Theorems & Definitions (66)

  • Example 2.1
  • Definition 3.1: Source node and source component
  • Definition 3.2: Filtered graph topology
  • Lemma 3.4
  • Remark 3.5
  • Corollary 4.1
  • proof
  • Definition 4.2
  • Definition 4.3
  • Lemma 4.4
  • ...and 56 more