Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function

Elissa Mhanna; Mohamad Assaad

TL;DR

This work introduces a zeroth-order distributed optimization method that applies the gradient-tracking technique with a one-point gradient estimate, and proves that it converges in the non-convex setting using a single noisy function query at a time.

Abstract

Zeroth-order (ZO) optimization is a powerful tool for dealing with realistic constraints. Separately, the gradient-tracking (GT) technique has proved to be an efficient method for distributed optimization aiming to achieve consensus. However, GT is a first-order (FO) method that requires knowledge of the gradient, which is not always available in practice. In this work, we introduce a zeroth-order distributed optimization method that applies the gradient-tracking technique with a one-point gradient estimate. We prove that this new technique converges with a single noisy function query at a time in the non-convex setting. We then establish a convergence rate of $O(\frac{1}{\sqrt[3]{K}})$ after $K$ iterations, which competes with the rate of $O(\frac{1}{\sqrt[4]{K}})$ achieved by its centralized counterparts. Finally, a numerical example validates our theoretical results.
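As a rough illustration of the idea in the abstract, the sketch below pairs a standard gradient-tracking update with a one-point gradient estimate, so that each agent queries its noisy objective once per iteration. It is not the paper's algorithm: the estimator form `phi * f(x + gamma * phi)`, the Rademacher perturbation, the step-size schedules, the ring network, and the quadratic local objectives are all assumptions chosen for illustration.

```python
import numpy as np


def one_point_estimate(f, x, gamma, rng):
    """One-point zeroth-order gradient estimate from a single query of f.

    Assumed form (not taken from the source): g = phi * f(x + gamma * phi),
    with phi a random +/-1 (Rademacher) perturbation vector.
    """
    phi = rng.choice([-1.0, 1.0], size=x.shape)
    return phi * f(x + gamma * phi)


def gradient_tracking_step(X, Y, G_old, W, fs, alpha, gamma, rng):
    """One gradient-tracking iteration over all agents (rows of X).

    X: agents' iterates, Y: gradient trackers, G_old: previous estimates,
    W: doubly stochastic mixing matrix, fs: list of local objectives.
    """
    X_new = W @ (X - alpha * Y)                      # mix with neighbors after a descent step
    G_new = np.stack([one_point_estimate(f, x, gamma, rng)
                      for f, x in zip(fs, X_new)])   # one function query per agent
    Y_new = W @ Y + G_new - G_old                    # track the network-average estimate
    return X_new, Y_new, G_new


def run(num_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = 4, 3
    # Hypothetical local objectives: noisy quadratics with different centers.
    centers = rng.normal(size=(n, d))
    fs = [lambda x, c=c: 0.5 * np.sum((x - c) ** 2) + 0.01 * rng.normal()
          for c in centers]
    # Ring network with a symmetric, doubly stochastic mixing matrix.
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    X = rng.normal(size=(n, d))
    G = np.stack([one_point_estimate(f, x, 0.5, rng) for f, x in zip(fs, X)])
    Y = G.copy()                                     # trackers start at the first estimates
    for k in range(num_iters):
        alpha = 0.02 / (k + 1) ** 0.75               # assumed decaying step size
        gamma = 0.5 / (k + 1) ** 0.25                # assumed decaying perturbation size
        X, Y, G = gradient_tracking_step(X, Y, G, W, fs, alpha, gamma, rng)
    return X, Y, G, W
```

Because `W` is doubly stochastic and the trackers are initialized at the first estimates, the average of the trackers `Y` equals the average of the current estimates `G` at every iteration. This invariant is what lets each agent follow the network-wide gradient estimate using only neighbor communication.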

Paper Structure (18 sections, 7 theorems, 60 equations, 10 figures, 1 table, 1 algorithm)

Introduction
Related Work
Challenges and Contribution
Notation
Problem Assumptions
Algorithm Description
Convergence Result
Convergence Rate
Numerical Example
Simulation Setup
Simulation Results
Conclusion
Estimated gradient
Gradient Estimate Norm Squared Bound
Proof of Convergence
...and 3 more sections

Key Result

Proposition 3.3

Let the assumptions on the noise and on the perturbation hold. Then $g_{i,k}$ is a biased estimator of the agent's gradient $\nabla F_i(x_{i,k})$ for all $i\in\mathcal{N}$, where $b_{i,k}$ denotes the bias with respect to the true gradient. Refer to the appendix on the estimated gradient for details.
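To make the notion of a biased one-point estimator concrete, here is a small sketch that computes the exact expectation of such an estimate by enumerating every perturbation vector. The proposition's own expression is not reproduced in this summary, so the estimator form `phi * f(x + gamma * phi)`, the Rademacher perturbation, and the noiseless cubic test function are assumptions for illustration only.

```python
import itertools
import numpy as np


def expected_one_point_estimate(f, x, gamma):
    """Exact mean of g = phi * f(x + gamma * phi) over all +/-1 vectors phi."""
    d = x.size
    signs = itertools.product([-1.0, 1.0], repeat=d)
    estimates = [np.array(phi) * f(x + gamma * np.array(phi)) for phi in signs]
    return np.mean(estimates, axis=0)


def cubic(x):
    """Hypothetical smooth test objective; its gradient is 3 * x**2."""
    return np.sum(x ** 3)


x = np.array([1.0, -2.0, 0.5])
gamma = 0.1
mean_g = expected_one_point_estimate(cubic, x, gamma)
grad = 3 * x ** 2
# Rescale by 1/gamma and subtract the true gradient to isolate the bias.
bias = mean_g / gamma - grad
print(bias)
```

For this particular example the expectation works out analytically to `gamma * (grad + gamma**2)`, so the estimate is a scaled gradient plus a bias of `gamma**2` per coordinate. The bias vanishes as the perturbation size shrinks, which is one reason to use a decaying perturbation sequence.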

Figures (10)

Figure 1: The evolution of the expected loss function for the digits $6$ and $7$.
Figure 2: The evolution of the consensus error for the digits $6$ and $7$.
Figure 3: The evolution of the consensus error for the digits $6$ and $7$ as compared with the rate $O(\frac{1}{K+1})$.
Figure 4: The evolution of the gradient tracking error for the digits $6$ and $7$.
Figure 5: The evolution of the accuracy for the digits $6$ and $7$.
...and 5 more figures

Theorems & Definitions (8)

Remark 2.1
Proposition 3.3
Lemma 3.4
Lemma 3.5
Lemma 3.6
Theorem 3.7
Theorem 4.1
Lemma B.1
