Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition

Aleksandr Lobanov; Alexander Gasnikov

TL;DR

A novel gradient-free algorithm is provided. It is obtained by replacing the gradient oracle in the biased Accelerated SGD algorithm with a gradient approximation based on $l_2$ randomization, which generalizes the convergence results of the AC-SA algorithm to the case where the gradient oracle returns a noisy (inexact) objective function value.

Abstract

This paper is devoted to solving a convex stochastic optimization problem in an overparameterization setup for the case where the gradient cannot be computed, but the objective function value can. For this class of problems we provide a novel gradient-free algorithm, obtained by replacing the gradient oracle in the biased Accelerated SGD algorithm with a gradient approximation based on $l_2$ randomization. This generalizes the convergence results of the AC-SA algorithm to the case where the gradient oracle returns a noisy (inexact) objective function value. We also perform a detailed analysis to find the maximum admissible level of adversarial noise at which the desired accuracy can still be guaranteed. We verify the theoretical convergence results on a model example.
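
The gradient approximation with $l_2$ randomization mentioned above can be illustrated with a minimal sketch. It assumes the commonly used two-point form, in which a direction is sampled uniformly from the unit Euclidean sphere and the function is queried at two symmetric points. The names `l2_gradient_estimate`, `tau` (smoothing radius), `batch_size`, and the test function `quadratic` are hypothetical and chosen for illustration only. The sketch uses an exact deterministic function value, whereas the setting described here allows a noisy (inexact) one.

```python
import numpy as np


def l2_gradient_estimate(f, x, tau, rng, batch_size=1):
    """Two-point gradient approximation with l2 randomization (illustrative sketch).

    f          -- black-box function returning a scalar value (assumed exact here)
    x          -- 1-D point at which the gradient is approximated
    tau        -- smoothing radius (hypothetical parameter name)
    rng        -- numpy random Generator
    batch_size -- number of random directions averaged per estimate
    """
    d = x.size
    g = np.zeros(d)
    for _ in range(batch_size):
        # A normalized Gaussian vector is uniformly distributed on the unit l2 sphere.
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        # Central finite difference along e, scaled by the dimension d.
        g += d * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e
    return g / batch_size


def quadratic(x):
    # Simple test objective (an assumption); its true gradient at x is x itself.
    return 0.5 * float(np.dot(x, x))


rng = np.random.default_rng(0)
x0 = np.array([1.0, -2.0, 3.0])
estimate = l2_gradient_estimate(quadratic, x0, tau=1e-3, rng=rng, batch_size=20000)
print(estimate)  # close to x0 on average; individual estimates are noisy
```

Averaging over more directions reduces the variance of the estimate, and an accelerated method would call such an estimator in place of the gradient oracle at each iteration.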

Paper Structure (22 sections, 4 theorems, 48 equations, 1 figure, 2 algorithms)

Introduction
Related works
Adversarial noise.
SGD type algorithms.
Gradient noise assumptions.
Notations
Paper Organization
Technical Preliminaries
Assumptions on the Objective Function
Assumptions on the Gradient Oracle
Accelerated SGD with Biased Gradient
Main Result
Experiments
Conclusion
Auxiliary Facts and Results
...and 7 more sections

Key Result

theorem 1

Let $f$ satisfy Assumptions ass:convex--ass:f_star and let the gradient oracle from Definition def:biased_oracle satisfy Assumption ass:stoch_noise. Then the Biased AC-SA algorithm guarantees convergence with a universal constant $c$

Figures (1)

Figure 1: Convergence of the Accelerated Zero-Order Stochastic Gradient Descent Method and the effect of parameter $B$ (batch size) on the iteration complexity.

Theorems & Definitions (7)

definition 1: Gradient Oracle
theorem 1: Convergence of Biased AC-SA
theorem 2
remark 1: General case
lemma 1: see Lemma 1, Woodworth_2021_over
lemma 2
proof
