Convergence result for the gradient-push algorithm and its application to boost up the Push-DIGing algorithm

Hyogi Choi, Woocheol Choi, Gwangil Kim

TL;DR

The paper addresses distributed optimization over directed graphs using the gradient-push algorithm with a constant stepsize. It introduces a contraction-based analysis via the operator $T_{\alpha}$ and proves that for $\alpha\in(0,\alpha_0]$ the iterates converge linearly to a fixed point $w^{\alpha}$, which places them in an $O(\alpha)$-neighborhood of the global minimizer $x_*$. Two function classes are handled: (i) each $f_i$ is $\mu_i$-strongly convex and $L_i$-smooth, and (ii) each $f_i$ is convex quadratic and $L_i$-smooth while the aggregate cost $f$ is strongly convex. In both cases the limit is $O(\alpha)$-accurate and the admissible range is $\alpha_0 = c/L$, where $L$ is the smoothness constant and $c>0$ is independent of $L$. A hybrid scheme is proposed that runs gradient-push with a relatively large stepsize for a number of iterations and then switches to Push-DIGing; a numerical test shows that it significantly improves on Push-DIGing alone. Overall, the work gives convergence guarantees for constant-stepsize gradient-push on directed graphs with a stepsize range whose order in $L$ is sharp.

Abstract

The gradient-push algorithm is a fundamental algorithm for the distributed optimization problem \begin{equation} \min_{x \in \mathbb{R}^d} f(x) = \sum_{j=1}^n f_j (x), \end{equation} where each local cost $f_i$ is known only to agent $a_i$ for $1 \leq i \leq n$ and the agents are connected by a directed graph. In this paper, we obtain convergence results for the gradient-push algorithm with constant stepsize whose range is sharp in terms of the order of the smoothness constant $L>0$. Precisely, under either of two settings, 1) each local cost $f_i$ is strongly convex and $L$-smooth, or 2) each local cost $f_i$ is convex quadratic and $L$-smooth while the aggregate cost $f$ is strongly convex, we show that the gradient-push algorithm with stepsize $\alpha>0$ converges to an $O(\alpha)$-neighborhood of the minimizer of $f$ for the range $\alpha\in (0, c/L]$ with a value $c>0$ independent of $L>0$. As a benefit of the result, we suggest a hybrid algorithm that performs the gradient-push algorithm with a relatively large stepsize $\alpha>0$ for a number of iterations and then switches to the Push-DIGing algorithm. A numerical test verifies that the hybrid algorithm significantly enhances the performance of the Push-DIGing algorithm. The convergence results of the gradient-push algorithm are also supported by numerical tests.
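To make the constant-stepsize behaviour concrete, the following is a minimal sketch, not the authors' implementation. It assumes scalar quadratic costs $f_j(x) = \tfrac{1}{2} a_j (x - b_j)^2$, a hypothetical 4-agent directed graph in which each agent splits its mass equally among its out-neighbors (a column-stochastic mixing rule), and one common way of writing the gradient-push update. The names `gradient_push`, `max_error`, `OUT_NEIGHBORS`, `A`, and `B` are illustrative only.

```python
def gradient_push(a, b, out_neighbors, alpha, iters):
    """Constant-stepsize gradient-push for f_j(x) = 0.5 * a[j] * (x - b[j])**2."""
    n = len(a)
    w = [0.0] * n   # push-sum numerators w_j
    y = [1.0] * n   # push-sum weights y_j, started at 1
    z = [0.0] * n   # de-biased estimates z_j = w_j / y_j
    for _ in range(iters):
        new_w = [0.0] * n
        new_y = [0.0] * n
        for j in range(n):
            grad = a[j] * (z[j] - b[j])           # local gradient at z_j
            share = 1.0 / len(out_neighbors[j])   # equal split: columns sum to 1
            for i in out_neighbors[j]:
                new_w[i] += share * (w[j] - alpha * grad)
                new_y[i] += share * y[j]
        w, y = new_w, new_y
        z = [w[i] / y[i] for i in range(n)]
    return z


# Hypothetical directed graph: agent j pushes to the agents in OUT_NEIGHBORS[j]
# (every agent keeps a self-loop, and the graph is strongly connected).
OUT_NEIGHBORS = [[0, 1, 2], [1, 2], [2, 3], [3, 0]]
A = [1.0, 2.0, 3.0, 4.0]   # curvatures a_j, so the smoothness constant is L = 4
B = [1.0, 2.0, 3.0, 4.0]   # local minimizers b_j
X_STAR = sum(ai * bi for ai, bi in zip(A, B)) / sum(A)   # minimizer of sum_j f_j


def max_error(alpha, iters=5000):
    """Largest distance between any agent's estimate and the global minimizer."""
    z = gradient_push(A, B, OUT_NEIGHBORS, alpha, iters)
    return max(abs(zi - X_STAR) for zi in z)


if __name__ == "__main__":
    for alpha in (0.05, 0.02, 0.005):
        print(alpha, max_error(alpha))
```

In this toy setting the residual error does not vanish as the iteration count grows, but it shrinks when the stepsize is reduced, which is the qualitative $O(\alpha)$-neighborhood behaviour described in the abstract.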

Paper Structure

This paper contains 11 sections, 18 theorems, 124 equations, 7 figures, 1 table, 1 algorithm.

Key Result

Theorem 1.2

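Suppose one of the following conditions holds: (1) each local cost $f_i$ is strongly convex and $L$-smooth; (2) each local cost $f_i$ is convex quadratic and $L$-smooth, and the aggregate cost $f$ is strongly convex. Then for each $1\leq k \leq n$, the sequence $\{z_k (t)\}_{t \geq 0}$ of the gradient-push algorithm (1.2) with stepsize $\alpha \in (0,\alpha_0]$ converges linearly to an $O(\alpha)$-neighborhood of the minimizer of $f$.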
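
The recursion cited in Theorem 1.2 is not reproduced on this page. For orientation only, one standard way to write a constant-stepsize gradient-push update is sketched here. The column-stochastic mixing matrix $W$, the auxiliary variables $w_i(t)$ and $y_i(t)$, and the initialization $y_i(0)=1$ are assumptions chosen to be consistent with the notation in the figure captions; the paper's own equation may be arranged differently.

\begin{equation} w_i(t+1) = \sum_{j=1}^n W_{ij}\bigl(w_j(t) - \alpha \nabla f_j(z_j(t))\bigr), \qquad y_i(t+1) = \sum_{j=1}^n W_{ij}\, y_j(t), \qquad z_i(t+1) = \frac{w_i(t+1)}{y_i(t+1)}. \end{equation}

In this form, for a strongly connected graph with self-loops, $y(t)$ converges to $n\pi$, where $\pi$ is the right eigenvector of $W$ with $W\pi = \pi$ and entries summing to one. Dividing $w_i$ by $y_i$ removes the bias that the directed graph introduces, so the $z_i(t)$ are the quantities compared with the minimizer $x_*$.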

Figures (7)

  • Figure 1: The flow of the main theorems for proving the convergence of the gradient-push algorithm (1.2)
  • Figure 2: Left: The graphs of $\log_{10}(\sum_{k=1}^n \|z_k (t)-x_*\|)$ for the gradient-push ($\alpha_0 = 0.0297$). Right: The graphs of $\log_{10}(\sum_{k=1}^n \|z_k (t)-x_*\|)$ for the Push-DIGing ($\alpha_1 = 0.001175$).
  • Figure 3: The graph of $\mathcal{L}_{\alpha}$ with respect to $\alpha \in (0,2\alpha_0]$. Left: Case 1. Right: Case 2.
  • Figure 4: Left: The graph of $\log_{10}\Vert w_{\alpha}(t) - w^\alpha \Vert_{\pi\otimes1_d}$ for the least squares problem with stepsize $\alpha = \alpha_0$ and iterations $t=1,\cdots,1000$. Right: The graph of the error $\Vert w^\alpha-n\pi \otimes x_* \Vert_{\pi\otimes 1_d}$ for stepsizes $\alpha \in (0,\alpha_0]$.
  • Figure 5: Both graphs show $\log_{10}\|w_{\alpha}(t) - n\pi \otimes x_*\|_{\pi \otimes 1_d}$ for the regularized least squares problem with various stepsizes. Left: Convergent cases. Right: A divergent case.
  • ...and 2 more figures

Theorems & Definitions (38)

  • Definition 1.1
  • Theorem 1.2
  • Definition 1.3
  • Theorem 2.1: CKY2
  • Remark 2.2
  • Theorem 2.3
  • Theorem 2.4
  • Theorem 2.5
  • Theorem 2.6
  • Lemma 3.1
  • ...and 28 more