Bregman Linearized Augmented Lagrangian Method for Nonconvex Constrained Stochastic Zeroth-order Optimization

Qiankun Shi, Xiao Wang, Hao Wang

TL;DR

This work addresses nonconvex constrained stochastic zeroth-order optimization with exact constraints and noisy objective evaluations. It introduces a single-loop Bregman linearized augmented Lagrangian method that uses a two-point zeroth-order gradient estimator with variance reduction, and analyzes the oracle complexity to achieve an $\varepsilon$-KKT point. Key findings show dimension-dependent improvements: the complexity scales as $O(p d^{2/p} \varepsilon^{-3})$ for $p\in[2,2\ln d]$ and as $O(\ln d \varepsilon^{-3})$ for $p>2\ln d$ under Rademacher smoothing, matching the best known $\varepsilon$-rates while reducing the $d$-dependence. Numerical experiments on constrained Lasso and black-box adversarial attacks validate the approach and demonstrate practical efficiency gains over existing zeroth-order methods.
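
A short worked calculation may help explain where the threshold $p = 2\ln d$ in these bounds comes from. It uses only the stated bound $p\, d^{2/p}$; the sample dimension $d = 10^6$ is an illustrative assumption, not a value taken from the paper.

```latex
% Minimize the dimension factor g(p) = p d^{2/p} over p >= 2.
\[
\ln g(p) = \ln p + \frac{2\ln d}{p},
\qquad
\frac{d}{dp}\ln g(p) = \frac{1}{p} - \frac{2\ln d}{p^{2}} = 0
\;\Longleftrightarrow\; p = 2\ln d .
\]
% At the minimizer, d^{2/p} = \exp(2\ln d / p) = e, so
\[
g(2\ln d) = 2e\,\ln d .
\]
% Illustration with d = 10^6 (assumed):
\[
p = 2:\quad g = 2d = 2\times 10^{6},
\qquad
p = 2\ln d \approx 27.6:\quad g = 2e\ln d \approx 75.1 .
\]
```

In other words, the factor $p\, d^{2/p}$ decreases on $[2, 2\ln d]$ and reaches $O(\ln d)$ at the right endpoint, which is consistent with the $O(\ln d\, \varepsilon^{-3})$ bound stated for $p > 2\ln d$.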

Abstract

In this paper, we study nonconvex constrained stochastic zeroth-order optimization problems, for which we have access to exact information about the constraints and to noisy function values of the objective. We propose a Bregman linearized augmented Lagrangian method that utilizes stochastic zeroth-order gradient estimators combined with a variance reduction technique. We analyze its oracle complexity, measured as the total number of stochastic function value evaluations required to achieve an \(\varepsilon\)-KKT point in \(\ell_p\)-norm metrics with \(p \ge 2\), where \(p\) is a parameter associated with the selected Bregman distance. In particular, starting from a near-feasible initial point and using Rademacher smoothing, the oracle complexity is of order \(O(p d^{2/p} \varepsilon^{-3})\) for \(p \in [2, 2 \ln d]\), and \(O(\ln d \cdot \varepsilon^{-3})\) for \(p > 2 \ln d\), where \(d\) denotes the problem dimension. These results show that the proposed method can achieve a dimensional dependency lower than \(O(d)\) without additional assumptions, provided that the Bregman distance is chosen properly. This offers a significant improvement in the high-dimensional setting over existing work, and matches the lowest complexity order with respect to the tolerance \(\varepsilon\) reported in the literature. Numerical experiments on constrained Lasso and black-box adversarial attack problems highlight the promising performance of the proposed method.
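
To make the idea of a zeroth-order gradient estimator with Rademacher smoothing concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the central-difference form, the function name `rademacher_two_point_grad`, the smoothing radius `mu`, and the plain averaging over `num_dirs` directions are all assumptions chosen for illustration, and the variance reduction and augmented Lagrangian steps are omitted.

```python
import random


def rademacher_two_point_grad(f, x, mu=1e-4, num_dirs=1, rng=None):
    """Estimate the gradient of f at x from function values only.

    Hypothetical helper: averages central two-point differences taken
    along random Rademacher directions u in {-1, +1}^d.
    """
    rng = rng or random.Random(0)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_dirs):
        # Each coordinate of u is -1 or +1 with equal probability.
        u = [rng.choice((-1.0, 1.0)) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        # Directional slope from two function evaluations.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_dirs
    return grad


def quadratic(x):
    """Toy objective with known gradient: grad f(x) = x."""
    return 0.5 * sum(xi * xi for xi in x)


x0 = [1.0, -2.0, 3.0]
estimate = rademacher_two_point_grad(quadratic, x0, num_dirs=20000)
```

Each direction costs two function evaluations regardless of the dimension, which is why oracle complexity is counted in function value evaluations. On the toy quadratic, `estimate` should be close to `x0`, since the expectation of `u * (u . x)` over Rademacher `u` equals `x`.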

Paper Structure

This paper contains 16 sections, 15 theorems, 70 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Lemma 1

Under Assumptions ass:ms-p and ass:exs, it holds that for any $x \in X$,

Figures (4)

  • Figure 1: Comparison of different $q$ on low-dimensional constrained Lasso problems
  • Figure 2: Comparison of different $q$ on high-dimensional constrained Lasso problems
  • Figure 3: A black-box targeted attack example with our method
  • Figure 4: A black-box untargeted attack example with our method

Theorems & Definitions (31)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Definition 1
  • Lemma 5
  • ...and 21 more