A Zeroth-Order Proximal Algorithm for Consensus Optimization

Chengan Wang, Zichong Ou, Jie Lu

TL;DR

The paper tackles distributed consensus optimization in which each node has only zeroth-order (function-value) access to its local objective, and the global objective is the sum of all local objectives. It introduces ZoPro, a zeroth-order proximal algorithm that uses Gaussian smoothing to form unbiased estimates of the gradient $\nabla f_\mu$ and Hessian $\nabla^2 f_\mu$ of the smoothed function $f_\mu$, and plugs them into a distributed second-order proximal update with a backtracking Armijo step size. Under $m_i$-strong convexity and $M_i$-smoothness of each local objective, and with suitable parameter choices, ZoPro converges linearly in expectation to a neighborhood of the optimum, where the size of the neighborhood is controlled by the smoothing parameter $\mu$ and the batch size $b$. Numerical experiments show that ZoPro outperforms several zeroth-order baselines and runs faster than some second-order methods on large-scale problems, which points to its practical efficiency and scalability in decentralized settings where derivatives are unavailable.
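To make the Gaussian-smoothing idea concrete, here is a minimal single-node sketch in Python. It is not the paper's oracle: the function names `zo_gradient` and `zo_hessian`, the default values of `mu` and `b`, and the specific forward-difference and symmetric second-difference forms are assumptions, chosen because they are standard estimators that are unbiased for $\nabla f_\mu$ and $\nabla^2 f_\mu$ when $u \sim \mathcal{N}(0, I)$. ZoPro's exact estimators may differ.

```python
import numpy as np


def zo_gradient(f, x, mu=1e-3, b=20, rng=None):
    """Batch-averaged Gaussian-smoothing gradient estimate (hypothetical helper).

    Averages b samples of (f(x + mu*u) - f(x)) / mu * u with u ~ N(0, I),
    using function values only.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    fx = f(x)
    g = np.zeros(d)
    for _ in range(b):
        u = rng.standard_normal(d)
        g += (f(x + mu * u) - fx) / mu * u
    return g / b


def zo_hessian(f, x, mu=1e-3, b=20, rng=None):
    """Batch-averaged Gaussian-smoothing Hessian estimate (hypothetical helper).

    Averages b samples of
        (f(x + mu*u) + f(x - mu*u) - 2 f(x)) / (2 mu^2) * (u u^T - I).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    fx = f(x)
    identity = np.eye(d)
    H = np.zeros((d, d))
    for _ in range(b):
        u = rng.standard_normal(d)
        second_diff = f(x + mu * u) + f(x - mu * u) - 2.0 * fx
        H += second_diff / (2.0 * mu**2) * (np.outer(u, u) - identity)
    return H / b
```

Both estimators are noisy for small `b`. Their variance shrinks as the batch size grows, which is consistent with the summary's statement that the size of the convergence neighborhood depends on $\mu$ and $b$.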

Abstract

This paper considers a consensus optimization problem in which all the nodes in a network, each with access only to zeroth-order information of its local objective function, attempt to cooperatively reach a common minimizer of the sum of their local objectives. To address this problem, we develop ZoPro, a zeroth-order proximal algorithm, which incorporates a zeroth-order oracle for approximating the Hessian and gradient into a recently proposed, high-performance distributed second-order proximal algorithm. We show that the proposed ZoPro algorithm, equipped with a dynamic stepsize, converges linearly to a neighborhood of the optimum in expectation, provided that each local objective function is strongly convex and smooth. Extensive simulations demonstrate that ZoPro converges faster than several state-of-the-art distributed zeroth-order algorithms and outperforms a few distributed second-order algorithms in terms of the running time needed to reach a given accuracy.

Paper Structure

This paper contains 14 sections, 3 theorems, 48 equations, 3 figures, 1 algorithm.

Key Result

Proposition 1

Let $f_i:\mathbb{R}^{d}\rightarrow\mathbb{R}$ be an $L$-smooth function and let $\left\{x_i^{k}\right\}$ be the sequence generated by $x_i^{k+1}=x_i^{k}+\alpha_i^{k}d_i^{k}$, where $\alpha_i^{k}$ is the stepsize determined by the backtracking line search and $d_i^k$ is the corresponding search direction…
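The update $x_i^{k+1}=x_i^{k}+\alpha_i^{k}d_i^{k}$ with a backtracking stepsize can be illustrated with a generic Armijo line search. The sketch below is the textbook sufficient-decrease rule, not the exact condition of Proposition 1, which is truncated above. The name `armijo_backtracking` and the parameters `alpha0`, `beta`, `c` and `max_iter` are assumptions for illustration. In a zeroth-order setting, `g` would be a gradient estimate rather than the true gradient.

```python
import numpy as np


def armijo_backtracking(f, x, d, g, alpha0=1.0, beta=0.5, c=1e-4, max_iter=50):
    """Generic Armijo backtracking line search (hypothetical helper).

    Shrinks alpha by the factor beta until
        f(x + alpha*d) <= f(x) + c * alpha * <g, d>,
    where g is a gradient (or gradient estimate) at x and d is a search direction.
    If the condition is never met within max_iter trials, the last (smallest)
    alpha tried is returned.
    """
    fx = f(x)
    slope = float(g @ d)  # directional derivative estimate; negative for descent
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            break
        alpha *= beta
    return alpha


# Usage on a simple quadratic with the steepest-descent direction d = -g.
f = lambda z: 5.0 * float(z @ z)
x = np.array([1.0, 1.0])
g = 10.0 * x
alpha = armijo_backtracking(f, x, -g, g)
x_next = x + alpha * (-g)
```

Starting from a unit trial step and halving keeps the number of extra function evaluations small, which matters when every evaluation is a zeroth-order oracle query.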

Figures (3)

  • Figure 1: Convergence performance of ZoPro, ZOPD and ZOGT
  • Figure 2: Convergence performance of ZoPro, ESOM, DQM and SoPro for the medium-scale problem
  • Figure 3: Convergence performance of ZoPro, ESOM, DQM and SoPro for the large-scale problem

Theorems & Definitions (6)

  • Proposition 1
  • proof
  • Theorem 1
  • proof
  • Lemma 1
  • proof