Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function

Elissa Mhanna, Mohamad Assaad

TL;DR

This work introduces a zero-order distributed optimization method based on a one-point estimate of the gradient tracking technique and proves that this new technique converges with a single noisy function query at a time in the non-convex setting.

Abstract

Zero-order (ZO) optimization is a powerful tool for dealing with realistic constraints. Separately, the gradient-tracking (GT) technique has proved to be an efficient method for distributed optimization aiming to achieve consensus. However, GT is a first-order (FO) method that requires knowledge of the gradient, which is not always available in practice. In this work, we introduce a zero-order distributed optimization method based on a one-point estimate of the gradient tracking technique. We prove that this new technique converges with a single noisy function query at a time in the non-convex setting. We then establish a convergence rate of $O(\frac{1}{\sqrt[3]{K}})$ after $K$ iterations, which competes with the $O(\frac{1}{\sqrt[4]{K}})$ rate of its centralized counterparts. Finally, a numerical example validates our theoretical results.
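
To make the idea in the abstract concrete, here is a minimal sketch of a generic gradient-tracking recursion in which each agent's gradient is replaced by a one-point estimate built from a single noisy function query. It is not the paper's algorithm: the mixing matrix `W`, the quadratic local losses defined by `TARGETS`, the Rademacher perturbation, the noise level, and the constant step sizes `alpha` and `gamma` are all assumptions chosen for illustration.

```python
import random

# Hypothetical doubly stochastic mixing matrix for 3 fully connected agents.
W = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
# Hypothetical local objectives: F_i(x) = 0.5 * (x - TARGETS[i])**2.
TARGETS = [1.0, 2.0, 3.0]


def query(i, x, rng):
    """One noisy function query of agent i (zero-mean Gaussian noise, assumed)."""
    return 0.5 * (x - TARGETS[i]) ** 2 + rng.gauss(0.0, 0.01)


def estimate(i, x, gamma, rng):
    """One-point estimate: perturb x once, query once, scale by the perturbation."""
    phi = rng.choice((-1.0, 1.0))  # Rademacher perturbation (assumed)
    return phi * query(i, x + gamma * phi, rng)


def mix(values):
    """Consensus step: each agent averages its neighbors' values using W."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


def run(num_iters=50, alpha=0.01, gamma=0.1, seed=0):
    rng = random.Random(seed)
    n = len(TARGETS)
    x = [0.0] * n
    g = [estimate(i, x[i], gamma, rng) for i in range(n)]
    y = list(g)  # tracker initialized with the first estimates
    for _ in range(num_iters):
        # Descent along the tracked direction, followed by consensus mixing.
        x = mix([x[i] - alpha * y[i] for i in range(n)])
        g_new = [estimate(i, x[i], gamma, rng) for i in range(n)]
        # Tracking update: mix the trackers, then add the change in estimates.
        y = [m + gn - go for m, gn, go in zip(mix(y), g_new, g)]
        g = g_new
    return x, y, g


x, y, g = run()
```

Because `W` is doubly stochastic and the tracker starts at the first estimates, the average of `y` equals the average of the latest estimates `g` at every iteration, which is the property that gives gradient tracking its name. Constant step sizes are used here only for brevity.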

Paper Structure

This paper contains 18 sections, 7 theorems, 60 equations, 10 figures, 1 table, 1 algorithm.

Key Result

Proposition 3.3

Let the assumptions on the noise and on the perturbation hold. Then $g_{i,k}$ is a biased estimator of the agent's gradient $\nabla F_i(x_{i,k})$ for all $i\in\mathcal{N}$, where $b_{i,k}$ denotes the bias with respect to the true gradient. Refer to the appendix on the gradient estimate for details.
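
As a rough illustration of what a one-point estimate looks like, the sketch below forms $g = \Phi \, f(x + \gamma\Phi)$ from a single noisy query and averages many independent copies. Everything here is an assumption made for the example rather than the paper's setup: the toy loss $\frac{1}{2}\|x\|^2$, the Gaussian noise, the Rademacher perturbation $\Phi$, and the helper names `noisy_loss`, `one_point_estimate`, and `averaged_scaled_estimate`.

```python
import random


def noisy_loss(x, rng, noise_std=0.1):
    """Toy stochastic objective: 0.5 * ||x||^2 plus zero-mean noise (assumed)."""
    return 0.5 * sum(v * v for v in x) + rng.gauss(0.0, noise_std)


def one_point_estimate(x, gamma, rng):
    """Single-query estimate g = Phi * f(x + gamma * Phi) with noise."""
    phi = [rng.choice((-1.0, 1.0)) for _ in x]  # Rademacher perturbation (assumed)
    value = noisy_loss([v + gamma * p for v, p in zip(x, phi)], rng)
    return [p * value for p in phi]


def averaged_scaled_estimate(x, gamma, n_samples, seed=0):
    """Monte Carlo average of g / gamma over independent perturbations."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        g = one_point_estimate(x, gamma, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / (n_samples * gamma) for t in total]


x = [1.0, -2.0]
avg = averaged_scaled_estimate(x, gamma=0.5, n_samples=200_000)
# The true gradient of 0.5 * ||x||^2 at x is x itself.
```

For this quadratic with Rademacher perturbations, the average of `g / gamma` approaches the true gradient `x`, so the expectation of the estimate is the gradient scaled by `gamma`. For a general smooth function, higher-order Taylor terms do not cancel, which is one way a bias term with respect to the true gradient can arise. Note also how many samples the average needs: a single one-point estimate is very noisy.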

Figures (10)

  • Figure 1: The evolution of the expected loss function for the digits $6$ and $7$.
  • Figure 2: The evolution of the consensus error for the digits $6$ and $7$.
  • Figure 3: The evolution of the consensus error for the digits $6$ and $7$ as compared with the rate $O(\frac{1}{K+1})$.
  • Figure 4: The evolution of the gradient tracking error for the digits $6$ and $7$.
  • Figure 5: The evolution of the accuracy for the digits $6$ and $7$.
  • ...and 5 more figures

Theorems & Definitions (8)

  • Remark 2.1
  • Proposition 3.3
  • Lemma 3.4
  • Lemma 3.5
  • Lemma 3.6
  • Theorem 3.7
  • Theorem 4.1
  • Lemma B.1