
Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition

Aleksandr Lobanov, Alexander Gasnikov

TL;DR

The paper proposes a novel gradient-free algorithm, obtained by replacing the gradient oracle in the biased Accelerated SGD algorithm with a gradient approximation based on $l_2$ randomization. This generalizes the convergence results of the AC-SA algorithm to the case where the oracle returns only a noisy (inexact) objective function value.

Abstract

This paper is devoted to solving a convex stochastic optimization problem in an overparameterization setup for the case where the gradient cannot be computed, but the objective function value can. For this class of problems we provide a novel gradient-free algorithm, obtained by replacing the gradient oracle in the biased Accelerated SGD algorithm with a gradient approximation based on $l_2$ randomization. This generalizes the convergence results of the AC-SA algorithm to the case where the oracle returns only a noisy (inexact) objective function value. We also perform a detailed analysis to find the maximum admissible level of adversarial noise at which the desired accuracy can still be guaranteed. We verify the theoretical convergence results on a model example.
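
The idea of a gradient approximation with $l_2$ randomization can be illustrated with a short sketch. This excerpt does not give the paper's exact estimator, so the code below assumes the common two-point form $g(x, e) = \frac{d}{2\tau}\left(\tilde f(x + \tau e) - \tilde f(x - \tau e)\right) e$, where $e$ is uniform on the Euclidean unit sphere, $\tau$ is a smoothing parameter, and $\tilde f$ is the noisy function value. The names `sample_unit_sphere`, `l2_gradient_estimate`, and `make_noisy_quadratic`, the quadratic test function, and the noise bound `delta` are all hypothetical choices made for illustration. The argument `batch_size` plays the role of the batch size $B$.

```python
import numpy as np


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the Euclidean (l2) unit sphere."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)


def l2_gradient_estimate(f_noisy, x, tau, batch_size, rng):
    """Batched two-point gradient estimate with l2 randomization.

    Assumed form (not taken from the excerpt):
        g = (d / (2 * tau)) * (f(x + tau * e) - f(x - tau * e)) * e
    averaged over `batch_size` independent directions e.
    """
    dim = x.shape[0]
    estimate = np.zeros(dim)
    for _ in range(batch_size):
        e = sample_unit_sphere(dim, rng)
        # Only function values are queried; no gradient is ever computed.
        diff = f_noisy(x + tau * e) - f_noisy(x - tau * e)
        estimate += (dim / (2.0 * tau)) * diff * e
    return estimate / batch_size


def make_noisy_quadratic(delta, rng):
    """Hypothetical black box: 0.5 * ||x||^2 plus noise bounded by delta."""
    def f_noisy(x):
        return 0.5 * float(x @ x) + delta * rng.uniform(-1.0, 1.0)
    return f_noisy


rng = np.random.default_rng(0)
x = np.ones(5)
black_box = make_noisy_quadratic(delta=1e-6, rng=rng)
# For this quadratic the true gradient at x is x itself.
g = l2_gradient_estimate(black_box, x, tau=1e-2, batch_size=2000, rng=rng)
```

In this sketch, noise bounded by `delta` perturbs each single-direction estimate by at most `dim * delta / tau` in norm, which hints at why the admissible noise level has to be tied to the smoothing parameter and the target accuracy.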

Paper Structure (22 sections, 4 theorems, 48 equations, 1 figure, 2 algorithms)

Key Result

theorem 1

Let $f$ satisfy Assumptions ass:convex--ass:f_star and let the gradient oracle from Definition def:biased_oracle satisfy Assumption ass:stoch_noise. Then the Biased AC-SA algorithm guarantees convergence with a universal constant $c$

Figures (1)

  • Figure 1: Convergence of the Accelerated Zero-Order Stochastic Gradient Descent Method and the effect of parameter $B$ (batch size) on the iteration complexity.

Theorems & Definitions (7)

  • definition 1: Gradient Oracle
  • theorem 1: Convergence of Biased AC-SA
  • theorem 2
  • remark 1: General case
  • lemma 1: see Lemma 1, Woodworth_2021_over
  • lemma 2
  • proof