On the convergence analysis of the decentralized projected gradient descent method

Woocheol Choi, Jimyeong Kim

TL;DR

This work analyzes the convergence of decentralized projected gradient descent (DPG) for constrained distributed optimization. It develops a novel sequential-estimate framework that leverages the contraction property of projection to bound deviations from the optimum, yielding an $O(\sqrt{\alpha})$-neighborhood for constant stepsizes and $O(t^{-p/2})$-rates for diminishing stepsizes, under standard smoothness and strong convexity assumptions. The authors further improve the bound to $O(\alpha)$ in the half-space domain $\Omega=\mathbb{R}^{d-1}\times\mathbb{R}_+$ with the optimum on the boundary, via a coordinate-splitting analysis. Numerical experiments on non-negative least squares and constrained logistic regression validate the theory and illustrate practical gains from a DPG+P-DIGing hybrid. Collectively, the results advance understanding of constrained decentralized optimization, guiding step-size choices and domain-aware convergence guarantees.

Abstract

In this work, we are concerned with the decentralized optimization problem: \begin{equation*} \min_{x \in \Omega}~f(x) = \frac{1}{n} \sum_{i=1}^n f_i (x), \end{equation*} where $\Omega \subset \mathbb{R}^d$ is a convex domain and each $f_i : \Omega \rightarrow \mathbb{R}$ is a local cost function known only to agent $i$. A fundamental algorithm is the decentralized projected gradient method (DPG) given by \begin{equation*} x_i(t+1)=\mathcal{P}_\Omega\Big[\sum^n_{j=1}w_{ij} x_j(t) -\alpha(t)\nabla f_i(x_i(t))\Big], \end{equation*} where $\mathcal{P}_\Omega$ is the projection operator onto $\Omega$ and $\{w_{ij}\}_{1\leq i,j \leq n}$ are the communication weights among the agents. While this method has been widely used in the literature, its convergence property has not been established so far, except for the special case $\Omega = \mathbb{R}^d$. This work establishes new convergence estimates of DPG when the aggregate cost $f$ is strongly convex and each function $f_i$ is smooth. If the stepsize is a constant $\alpha(t) \equiv \alpha > 0$ that is suitably small, we prove that each $x_i(t)$ converges to an $O(\sqrt{\alpha})$-neighborhood of the optimal point. We further improve this result by showing that $x_i(t)$ converges to an $O(\alpha)$-neighborhood of the optimal point if the domain is the half-space $\mathbb{R}^{d-1}\times \mathbb{R}_{+}$ for any dimension $d\in \mathbb{N}$. We also obtain new convergence results for decreasing stepsizes. Numerical experiments are provided to support the convergence results.
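The DPG update in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic local costs $f_i(x) = \frac{1}{2}\|x - b_i\|^2$, the three-agent doubly stochastic weight matrix `W`, the data `targets`, and the function names `dpg` and `project_half_space`. The domain is the half-space $\mathbb{R}^{d-1}\times\mathbb{R}_+$ with $d = 2$. For these costs $f(x) = \frac{1}{2}\|x - \bar b\|^2 + \text{const}$ with $\bar b = (2, -1)$, so the constrained minimizer is $x_* = (2, 0)$, which lies on the boundary.

```python
import math


def project_half_space(x):
    """Project onto R^{d-1} x R_+ by clipping the last coordinate at zero."""
    return x[:-1] + [max(x[-1], 0.0)]


def dpg(weights, targets, alpha, steps):
    """Run DPG with a constant stepsize for the illustrative quadratic costs
    f_i(x) = 0.5 * ||x - targets[i]||^2 (an assumption, not the paper's setup)."""
    n, d = len(targets), len(targets[0])
    x = [[0.0] * d for _ in range(n)]  # every agent starts at the origin
    for _ in range(steps):
        new_x = []
        for i in range(n):
            # Consensus step: weighted average of the neighbours' iterates.
            mixed = [sum(weights[i][j] * x[j][k] for j in range(n)) for k in range(d)]
            # Local gradient of f_i evaluated at agent i's own iterate x_i(t).
            grad = [x[i][k] - targets[i][k] for k in range(d)]
            # Gradient step followed by projection onto the domain.
            new_x.append(project_half_space([mixed[k] - alpha * grad[k] for k in range(d)]))
        x = new_x  # synchronous update of all agents
    return x


# Hypothetical data: doubly stochastic weights and one target b_i per agent.
W = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
targets = [[1.0, -3.0], [2.0, 1.0], [3.0, -1.0]]
x_star = [2.0, 0.0]  # projection of the mean target (2, -1) onto the half-space

for alpha in (0.1, 0.01):
    iterates = dpg(W, targets, alpha, steps=2000)
    worst = max(math.dist(x_i, x_star) for x_i in iterates)
    print(f"alpha={alpha}: max_i ||x_i - x_*|| = {worst:.4f}")
```

In this toy setup the agents do not reach $x_*$ exactly with a constant stepsize; the printed error should shrink as `alpha` decreases, which is the qualitative behavior the neighborhood estimates describe. The sketch makes no claim about matching the constants in the paper's bounds.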

Paper Structure

This paper contains 28 sections, 22 theorems, 198 equations, 6 figures, 1 table.

Key Result

Theorem 2.5

There exists a constant $R_s > 0$ such that the uniform bound holds for all $t \geq 0$ if at least one of the following statements holds true:

Figures (6)

  • Figure 1: The consequences of Proposition 3.1 and Proposition 3.2.
  • Figure 1: The graphs of $\log R(t)$ under various choices of constant stepsizes (left) and diminishing stepsizes (right).
  • Figure 1: The overall flow of the proofs of Proposition 3.1 and Proposition 3.2.
  • Figure 2: The graphs of $\log R(t)$ with P-DIGing, DPG using a constant stepsize, DPG+P-DIGing, and DPG with a diminishing stepsize.
  • Figure 3: The graph of $\log R(t)$ with P-DIGing, DPG using a constant step size, DPG+P-DIGing.
  • ...and 1 more figure

Theorems & Definitions (50)

  • Theorem 2.5: Conditions for uniform boundedness
  • Theorem 2.7: Consensus
  • Theorem 2.8: Convergence for constant stepsize
  • Remark 2.9
  • Theorem 2.11
  • Theorem 2.12: Convergence for diminishing stepsize
  • Remark 2.13
  • Remark 2.14
  • Remark 2.15
  • Proposition 3.1
  • ...and 40 more