Compressed Gradient Tracking for Decentralized Optimization Over General Directed Networks

Zhuoqing Song, Lei Shi, Shi Pu, Ming Yan

TL;DR

This work addresses decentralized optimization on general directed graphs by introducing CPP, which fuses gradient-tracking Push-Pull with unbiased compression to cut communication while preserving linear convergence for $\mu$-strongly convex and $L$-smooth objectives. It further adds B-CPP, an asynchronous broadcast variant that maintains linear convergence with even lower communication load. Theoretical results establish contraction via spectral radii of carefully constructed matrices and provide conditions on stepsizes and momentum to guarantee convergence; empirical results on logistic regression with compressed communication validate the methods and quantify communication gains. Overall, the paper delivers scalable, communication-efficient decentralized optimization tools applicable to large, privacy-conscious multi-agent systems.

Abstract

In this paper, we propose two communication-efficient decentralized optimization algorithms over a general directed multi-agent network. The first algorithm, termed Compressed Push-Pull (CPP), combines the gradient tracking Push-Pull method with communication compression. We show that CPP is applicable to a general class of unbiased compression operators and achieves a linear convergence rate for strongly convex and smooth objective functions. The second algorithm is a broadcast-like version of CPP (B-CPP), and it also achieves a linear convergence rate under the same conditions on the objective functions. B-CPP can be applied in an asynchronous broadcast setting and further reduces communication costs compared to CPP. Numerical experiments complement the theoretical analysis and confirm the effectiveness of the proposed methods.
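
For orientation, the uncompressed gradient tracking Push-Pull iteration that CPP builds on can be sketched as follows. This is a minimal illustration, not the paper's CPP or B-CPP: it omits compression entirely, and the 4-agent graph, the quadratic local objectives $f_i(x) = \tfrac{1}{2}\|x - b_i\|^2$, the stepsize, and the helper names `build_weights` and `push_pull` are all assumptions made for this example.

```python
import numpy as np

def build_weights(adj):
    """adj[i, j] = 1 if agent i receives from agent j (self-loops included)."""
    R = adj / adj.sum(axis=1, keepdims=True)  # row-stochastic: mixes decision variables
    C = adj / adj.sum(axis=0, keepdims=True)  # column-stochastic: mixes gradient trackers
    return R, C

def push_pull(R, C, grad, x0, alpha, iters):
    """Uncompressed Push-Pull sketch; row i of x is agent i's local copy."""
    x = x0.copy()
    g = grad(x)
    y = g.copy()  # trackers start at the local gradients
    for _ in range(iters):
        x_new = R @ (x - alpha * y)   # pull step on the decision variables
        g_new = grad(x_new)
        y = C @ y + g_new - g         # push step keeps sum(y) equal to sum of local gradients
        x, g = x_new, g_new
    return x

# Hypothetical directed graph: a ring 0 -> 1 -> 2 -> 3 -> 0 plus the extra edge 0 -> 2.
adj = np.array([[1, 0, 0, 1],
                [1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 0, 1, 1]], dtype=float)
R, C = build_weights(adj)

rng = np.random.default_rng(0)
b = rng.normal(size=(4, 3))          # assumed local data: f_i(x) = 0.5 * ||x - b_i||^2
grad = lambda x: x - b               # stacked local gradients, one row per agent

x = push_pull(R, C, grad, np.zeros((4, 3)), alpha=0.02, iters=3000)
print(np.abs(x - b.mean(axis=0)).max())  # distance of every agent to the global minimizer
```

Because the extra edge makes the graph irregular, neither `R` nor `C` is doubly stochastic here, which is the situation the directed-network setting is meant to cover. For this toy problem the global minimizer is the average of the `b_i`.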

Paper Structure

This paper contains 13 sections, 11 theorems, 103 equations, 5 figures, 2 algorithms.

Key Result

Lemma 1

For any matrices $\boldsymbol{\mathit{A}}\in\mathbb{R}^{n\times p}$, $\boldsymbol{\mathit{W}}\in\mathbb{R}^{n\times n}$, and a vector norm $\left\| \cdot \right\|_{*}$, we have ${\left\vert\left\vert\left\vert \boldsymbol{\mathit{W}}\boldsymbol{\mathit{A}} \right\vert\right\vert\right\vert}_{*} \leq \cdots$

Figures (5)

  • Figure 1: Linear convergence of Push-Pull/$\mathcal{A}\mathcal{B}$, CPP, and B-CPP with $b$-bit quantization ($b = 2, 4, 6$) and Rand-k ($k = 5, 10, 20$) compressors.
  • Figure 2: Linear convergence of B-CPP with $b$-bit quantization ($b = 2, 4, 6$) and Rand-k ($k = 5, 10, 20$) compressors.
  • Figure 3: Performance of Push-Pull/$\mathcal{A}\mathcal{B}$, CPP, and B-CPP against the number of transmitted bits: the left column shows the results with quantization ($b = 2, 4, 6$) and the right column shows the results with Rand-k ($k = 5, 10, 20$).
  • Figure 4: Performance of CPP and Push-Pull/$\mathcal{A}\mathcal{B}$ with different communication networks under both quantization and Rand-k compressors.
  • Figure 5: Performance of B-CPP with different communication networks under both quantization and Rand-k compressors.
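
The two compressor families named in the captions above can be illustrated with a short sketch. The definitions below are common textbook forms of Rand-k sparsification and stochastic-rounding quantization, chosen so that the output is unbiased (its expectation equals the input); the exact operators used in the paper's experiments may differ, and the function names `rand_k` and `quantize` are hypothetical.

```python
import numpy as np

def rand_k(x, k, rng):
    """Keep k uniformly chosen coordinates and rescale by d/k so that E[out] = x."""
    d = x.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)
    return out

def quantize(x, b, rng):
    """Stochastic rounding to 2**(b-1) magnitude levels per sign (assumed form)."""
    scale = np.max(np.abs(x))
    if scale == 0:
        return np.zeros_like(x)
    levels = 2 ** (b - 1)
    scaled = np.abs(x) / scale * levels
    # floor(s + u) with u ~ U[0, 1) rounds up with probability frac(s): unbiased.
    q = np.floor(scaled + rng.random(x.shape))
    return np.sign(x) * q * scale / levels

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
print(rand_k(x, 5, rng))     # 5 surviving coordinates, each scaled by 20/5
print(quantize(x, 4, rng))   # every entry is a multiple of max|x| / 8
```

Rand-k sends only `k` values plus their indices, while the quantizer sends a sign and a small integer per entry plus one scalar. Both trade extra variance for fewer transmitted bits, which is the trade-off Figure 3 measures.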

Theorems & Definitions (25)

  • Definition 1
  • Definition 2
  • Lemma 1: Lemma 5 in pu2020push
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Lemma 5
  • ...and 15 more