Power Homotopy for Zeroth-Order Non-Convex Optimizations

Chen Xu

TL;DR

GS-PowerHP is a zeroth-order method for non-convex optimization that combines a power-transformed Gaussian-smoothed surrogate $F_{N,\sigma}(\mu)=\mathbb{E}_{x\sim\mathcal{N}(\mu,\sigma^2 I_d)}[e^{N f(x)}]$ with an incrementally decaying smoothing radius $\sigma$. The algorithm runs in a single loop, updating $\mu$ with a stochastic gradient estimate while progressively reducing $\sigma$, and it converges in expectation to a neighborhood of the global maximizer $x^*$ with an iteration complexity of $O(d^2 \varepsilon^{-2})$. The theory shows that, for sufficiently large power $N$, stationary points of the surrogate concentrate near $x^*$, which lets $\mu_t$ converge to the neighborhood $\mathcal{S}_{x^*,\delta}$ of $x^*$. Empirically, GS-PowerHP consistently ranks among the top three of the competing smoothing-based and evolutionary zeroth-order methods on optimization benchmarks, and it places first on high-dimensional least-likely targeted black-box image attacks on ImageNet ($d=150{,}528$), illustrating robustness and scalability to very large $d$.
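
To make the "stochastic gradient estimate" concrete, the display below writes out the standard Gaussian-smoothing identity for the gradient of the surrogate. It is a sketch under the assumption that differentiation may be moved inside the expectation (for example, when $e^{N f}$ is bounded). The sample count $K$ and the draws $z_k$ are illustrative symbols that do not appear in the summary above, and the paper's exact estimator may differ in scaling or normalization.

$$
\nabla_\mu F_{N,\sigma}(\mu)
= \frac{1}{\sigma^{2}}\,\mathbb{E}_{x\sim\mathcal{N}(\mu,\sigma^2 I_d)}\!\left[(x-\mu)\,e^{N f(x)}\right]
= \frac{1}{\sigma}\,\mathbb{E}_{z\sim\mathcal{N}(0,I_d)}\!\left[z\,e^{N f(\mu+\sigma z)}\right]
\approx \frac{1}{K\sigma}\sum_{k=1}^{K} z_k\,e^{N f(\mu+\sigma z_k)},
\qquad z_k \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,I_d).
$$

The right-hand side uses only evaluations of $f$, which is what makes the update zeroth-order.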

Abstract

We introduce GS-PowerHP, a novel zeroth-order method for non-convex optimization problems of the form $\max_{x \in \mathbb{R}^d} f(x)$. Our approach has two key components: a power-transformed Gaussian-smoothed surrogate $F_{N,\sigma}(\mu) = \mathbb{E}_{x\sim\mathcal{N}(\mu,\sigma^2 I_d)}[e^{N f(x)}]$, whose stationary points cluster near the global maximizer $x^*$ of $f$ for sufficiently large $N$, and an incrementally decaying $\sigma$ for improved data efficiency. Under mild assumptions, we prove convergence in expectation to a small neighborhood of $x^*$ with an iteration complexity of $O(d^2 \varepsilon^{-2})$. Empirically, our approach consistently ranks among the top three across a suite of competing algorithms. Its robustness is underscored by the final experiment, a high-dimensional problem ($d=150{,}528$) in which it placed first on least-likely targeted black-box attacks against ImageNet images, surpassing all competing methods.
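
The single-loop structure described above (estimate a gradient of the surrogate from function values, update $\mu$, shrink $\sigma$) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation. The function names, the toy objective, the geometric decay schedule with a floor `sigma_min`, and all hyperparameter values are assumptions made for the example. The sketch also divides by the sum of the weights (a self-normalized estimate, i.e. the gradient of $\log F_{N,\sigma}$) to avoid overflow in $e^{N f(x)}$; that stabilization is swapped in here and is not taken from the abstract.

```python
import math
import random


def toy_objective(x):
    """Multimodal toy objective (hypothetical); global maximizer is x = (1, ..., 1)."""
    return sum(-(xi - 1.0) ** 2 + math.cos(5.0 * (xi - 1.0)) for xi in x)


def gs_powerhp_sketch(f, mu0, power_n=2.0, sigma0=1.0, sigma_min=0.2,
                      decay=0.99, lr=0.05, num_samples=100, iters=300, seed=0):
    """Single-loop sketch: ascend a power-transformed Gaussian-smoothed surrogate.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu, sigma, d = list(mu0), sigma0, len(mu0)
    for _ in range(iters):
        # Draw standard normal directions and query f at the perturbed points.
        zs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(num_samples)]
        vals = [f([m + sigma * z for m, z in zip(mu, zrow)]) for zrow in zs]
        # Power-transform weights exp(N * f); subtracting the max rescales every
        # weight by the same positive constant and prevents overflow.
        vmax = max(vals)
        w = [math.exp(power_n * (v - vmax)) for v in vals]
        total = sum(w)  # > 0 because the best sample has weight exp(0) = 1
        # Self-normalized estimate of grad log F (assumption, see lead-in).
        grad = [sum(wk * zrow[i] for wk, zrow in zip(w, zs)) / (total * sigma)
                for i in range(d)]
        mu = [m + lr * g for m, g in zip(mu, grad)]   # gradient ascent on mu
        sigma = max(sigma_min, decay * sigma)          # incrementally decay sigma
    return mu, sigma


if __name__ == "__main__":
    start = [-2.0, -2.0]
    mu_final, sigma_final = gs_powerhp_sketch(toy_objective, start)
    print("final mu:", mu_final)
    print("f(start) =", toy_objective(start), " f(final) =", toy_objective(mu_final))
```

Each iteration costs `num_samples` evaluations of `f` and no derivatives. A larger `power_n` concentrates the weights on the best samples, which is the mechanism the abstract credits for pulling stationary points toward $x^*$.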
