A Zeroth-Order Proximal Algorithm for Consensus Optimization

Chengan Wang; Zichong Ou; Jie Lu

TL;DR

The paper tackles distributed consensus optimization where each node only has zeroth-order access to its local objective, and the global objective is the sum of all locals. It introduces ZoPro, a zeroth-order proximal algorithm that uses Gaussian smoothing to form unbiased estimates of the gradient $\nabla f_\mu$ and Hessian $\nabla^2 f_\mu$ and plugs them into a distributed second-order proximal update with a backtracking Armijo step size. Under $m_i$-strong convexity and $M_i$-smoothness, and suitable parameter choices, ZoPro converges linearly in expectation to a neighborhood of the optimum, with the neighborhood controlled by the smoothing parameter $\mu$ and batch size $b$. Numerical experiments show ZoPro outperforms several zeroth-order baselines and is faster than some second-order methods on large-scale problems, highlighting its practical efficiency and scalability in decentralized settings where derivatives are unavailable.
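To make the Gaussian-smoothing idea concrete, the sketch below estimates a gradient from function values only, averaging over a batch of $b$ random directions. It uses the standard forward-difference Gaussian-smoothing estimator; the paper's exact oracle (and its Hessian counterpart) is not reproduced in this summary, so the function name `zo_gradient`, its default parameters, and the test function `quadratic` are illustrative assumptions rather than the authors' implementation.

```python
import random


def zo_gradient(f, x, mu=1e-4, b=20, rng=None):
    """Batch-averaged Gaussian-smoothing gradient estimate (assumed form).

    Each sample draws u ~ N(0, I) and uses (f(x + mu*u) - f(x)) / mu * u,
    which is an unbiased estimate of the gradient of the smoothed function.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(b):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        scale = (f(x_pert) - fx) / mu  # directional finite difference
        for j in range(d):
            grad[j] += scale * u[j] / b
    return grad


def quadratic(x):
    """Hypothetical local objective: 0.5 * ||x||^2, whose gradient is x."""
    return 0.5 * sum(xi * xi for xi in x)


# Only function evaluations are used; no derivative of `quadratic` is called.
estimate = zo_gradient(quadratic, [1.0, -2.0, 0.5], mu=1e-4, b=5000)
```

A larger batch size `b` reduces the variance of the estimate, and a smaller `mu` reduces the smoothing bias. This matches the TL;DR's statement that both parameters control the size of the neighborhood ZoPro converges to.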

Abstract

This paper considers a consensus optimization problem in which the nodes of a network, each with access only to zeroth-order information about its local objective function, cooperate to reach a common minimizer of the sum of their local objectives. To address this problem, we develop ZoPro, a zeroth-order proximal algorithm that incorporates a zeroth-order oracle for approximating the Hessian and gradient into a recently proposed, high-performance distributed second-order proximal algorithm. We show that ZoPro, equipped with a dynamic stepsize, converges linearly in expectation to a neighborhood of the optimum, provided that each local objective function is strongly convex and smooth. Extensive simulations demonstrate that ZoPro converges faster than several state-of-the-art distributed zeroth-order algorithms and outperforms a few distributed second-order algorithms in running time to reach a given accuracy.
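For reference, the problem described in the abstract can be written as follows. This is the standard consensus formulation, not a quotation from the paper: the node count $N$ and the edge set $\mathcal{E}$ of the network are notation assumed here, and the two forms are equivalent when the network is connected.

$$
\min_{x\in\mathbb{R}^{d}} \ \sum_{i=1}^{N} f_i(x)
\quad\Longleftrightarrow\quad
\min_{x_1,\dots,x_N\in\mathbb{R}^{d}} \ \sum_{i=1}^{N} f_i(x_i)
\quad \text{s.t.} \quad x_i = x_j \ \ \forall (i,j)\in\mathcal{E}.
$$

In the zeroth-order setting, node $i$ can evaluate $f_i$ at chosen points but cannot query $\nabla f_i$ or $\nabla^2 f_i$ directly.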

Paper Structure (14 sections, 3 theorems, 48 equations, 3 figures, 1 algorithm)

INTRODUCTION
PROBLEM FORMULATION
ALGORITHM DEVELOPMENT
SoPro Algorithm
Zeroth-order Oracle
ZoPro Algorithm
CONVERGENCE ANALYSIS
NUMERICAL EXPERIMENTS
Comparison with Zeroth-order Methods
Comparison with Second-order Methods
CONCLUSION
APPENDIX
Proof of Proposition 1
Proof of Theorem 1

Key Result

Proposition 1

Let $f_i:\mathbb{R}^{d}\rightarrow\mathbb{R}$ be an $L$-smooth function and let $\left\{x_i^{k}\right\}$ be the sequence generated by $x_i^{k+1}=x_i^{k}+\alpha_i^{k}d_i^{k}$, where $\alpha_i^{k}$ is the stepsize determined by the backtracking line search and $d_i^k$ is the corresponding search direction ...
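The update $x_i^{k+1}=x_i^{k}+\alpha_i^{k}d_i^{k}$ with a backtracking stepsize can be sketched as below. This is a generic Armijo backtracking rule, not the paper's exact procedure: the function name `backtracking_step`, the constants `alpha0`, `beta`, and `c`, and the example objective are all assumptions. In ZoPro the gradient argument would presumably be a zeroth-order estimate rather than the exact gradient used here.

```python
def backtracking_step(f, x, grad, direction,
                      alpha0=1.0, beta=0.5, c=1e-4, max_halvings=50):
    """Generic Armijo backtracking line search (assumed constants).

    Shrinks alpha until f(x + alpha*d) <= f(x) + c * alpha * <grad, d>,
    then returns (alpha, x + alpha*d).
    """
    slope = sum(g * d for g, d in zip(grad, direction))
    fx = f(x)
    alpha = alpha0
    for _ in range(max_halvings):
        x_new = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(x_new) <= fx + c * alpha * slope:
            return alpha, x_new
        alpha *= beta  # step too long: shrink and retry
    return 0.0, list(x)  # no acceptable step found; stay at x


def objective(x):
    """Hypothetical ill-conditioned quadratic: 0.5 * (x0^2 + 10 * x1^2)."""
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


x0 = [1.0, 1.0]
g0 = [1.0, 10.0]           # exact gradient of `objective` at x0
d0 = [-g for g in g0]      # steepest-descent direction, for illustration
alpha, x1 = backtracking_step(objective, x0, g0, d0)
```

Because the accepted stepsize always satisfies the sufficient-decrease test, each iterate lowers the objective whenever `direction` is a descent direction for the supplied gradient.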

Figures (3)

Figure 1: Convergence performance of ZoPro, ZOPD and ZOGT
Figure 2: Convergence performance of ZoPro, ESOM, DQM and SoPro for the medium-scale problem
Figure 3: Convergence performance of ZoPro, ESOM, DQM and SoPro for the large-scale problem

Theorems & Definitions (6)

Proposition 1
proof
Theorem 1
proof
Lemma 1
proof
