A Unified Zeroth-Order Optimization Framework via Oblivious Randomized Sketching

Haishan Ye; Xiangyu Chang; Xi Chen

TL;DR

This work develops a zeroth-order optimization framework that uses oblivious randomized sketching to unify and improve gradient estimators such as finite difference (FD) and random finite difference (RFD). By expressing gradient estimation through a sketching matrix $S$, the authors reduce the variance of RFD and establish high-probability convergence with weak dimension dependence, notably a query complexity that scales with $\mathrm{tr}(A)/\mu$ rather than $d \cdot L/\mu$. The framework extends to Hessian-aware settings via sketched preconditioners and to general $L$-smooth, $\mu$-strongly convex objectives with Lipschitz Hessians, giving improved complexity when the Hessian traces are small. A practical trace-estimation scheme allows automatic step-size selection and further improves adaptivity. Experiments on synthetic quadratic problems and real-world logistic-regression datasets support the dimension-robust performance and the usefulness of the trace-based Hessian insight for large-scale black-box optimization.
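To make the gap between $\mathrm{tr}(A)/\mu$ and $d \cdot L/\mu$ concrete, the following minimal sketch compares the factor $\mathrm{tr}(A)/L$ with the dimension $d$ for two hypothetical spectra. The eigenvalue profiles (a $1/i^2$ decay and a flat spectrum) and the helper name `effective_dimension` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def effective_dimension(eigenvalues):
    """Return tr(A) / L, where L is the largest eigenvalue of A.

    This ratio takes the place of the dimension d in the query
    complexity quoted above (hypothetical helper for illustration).
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return eigenvalues.sum() / eigenvalues.max()


d = 1000

# Assumed spectrum with rapid decay: lambda_i = 1 / i^2.
decaying = 1.0 / np.arange(1, d + 1) ** 2
# Assumed flat spectrum: every eigenvalue equals 1 (worst case).
flat = np.ones(d)

eff_decaying = effective_dimension(decaying)
eff_flat = effective_dimension(flat)

print(f"d = {d}")
print(f"tr(A)/L with decaying spectrum: {eff_decaying:.3f}")
print(f"tr(A)/L with flat spectrum:     {eff_flat:.3f}")
```

With a flat spectrum the ratio equals $d$, so nothing is gained. With the decaying spectrum it stays below $\pi^2/6$ however large $d$ becomes.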

Abstract

We propose a new framework for analyzing zeroth-order optimization (ZOO) from the perspective of oblivious randomized sketching. In this framework, commonly used gradient estimators in ZOO, such as finite difference (FD) and random finite difference (RFD), are unified through a general sketch-based formulation. By introducing the concept of oblivious randomized sketching, we show that properly chosen sketch matrices can significantly reduce the high variance of RFD estimates and enable high-probability convergence guarantees of ZOO, which are rarely available in existing RFD analyses.

We instantiate the framework on convex quadratic objectives and derive a query complexity of $\tilde{\mathcal{O}}\left(\frac{\mathrm{tr}(A)}{L} \cdot \frac{L}{\mu} \log\frac{1}{\epsilon}\right)$ to achieve an $\epsilon$-suboptimal solution, where $A$ is the Hessian, $L$ is the largest eigenvalue of $A$, and $\mu$ denotes the strong convexity parameter. This complexity can be substantially smaller than the standard query complexity of $\mathcal{O}\left(d \cdot \frac{L}{\mu} \log\frac{1}{\epsilon}\right)$, which depends linearly on the problem dimension, especially when $A$ has rapidly decaying eigenvalues. These advantages naturally extend to more general settings, including strongly convex and Hessian-aware optimization.

Overall, this work offers a novel sketch-based perspective on ZOO that explains why and when RFD-type methods can achieve weakly dimension-independent convergence in general smooth problems, providing both theoretical foundations and practical implications for ZOO.
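As a rough illustration of how a single sketch matrix can cover both FD and RFD, the sketch below estimates a gradient from forward differences taken along the columns of a matrix `S`. The abstract does not specify the paper's estimator, its scaling, or its sketch distribution, so the function `sketched_gradient`, the forward-difference form, the smoothing parameter `alpha`, and the Gaussian columns scaled by $1/\sqrt{\ell}$ are all assumptions made for illustration.

```python
import numpy as np


def sketched_gradient(f, x, S, alpha=1e-6):
    """Forward-difference gradient estimate along the columns of S.

    Approximates S @ S.T @ grad f(x) using S.shape[1] + 1 queries of f.
    Hypothetical helper; not the paper's exact estimator.
    """
    fx = f(x)
    # One directional finite difference per sketch column s_i.
    diffs = np.array([(f(x + alpha * s) - fx) / alpha for s in S.T])
    return S @ diffs


rng = np.random.default_rng(0)
d, ell = 20, 5

# Assumed test problem: convex quadratic with decaying eigenvalues.
eigs = 1.0 / np.arange(1, d + 1) ** 2
A = np.diag(eigs)
f = lambda v: 0.5 * v @ A @ v
x = rng.standard_normal(d)
true_grad = A @ x

# S = I recovers coordinate-wise finite differences (FD), d + 1 queries.
g_fd = sketched_gradient(f, x, np.eye(d))

# Gaussian S with ell << d columns gives an RFD-type estimate, ell + 1 queries.
# The 1/sqrt(ell) scaling makes E[S @ S.T] the identity.
S = rng.standard_normal((d, ell)) / np.sqrt(ell)
g_rfd = sketched_gradient(f, x, S)

print("FD error: ", np.linalg.norm(g_fd - true_grad))
print("RFD error:", np.linalg.norm(g_rfd - true_grad))
```

Under these assumptions the FD estimate matches the true gradient up to the discretization error, while the RFD-type estimate is unbiased but noisy. That noise is the variance the abstract says a properly chosen sketch can reduce.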

Paper Structure

Figures (4)

Theorems & Definitions (27)