Curvature-Aware Derivative-Free Optimization
Bumsu Kim, HanQin Cai, Daniel McKenzie, Wotao Yin
TL;DR
This paper addresses derivative-free optimization (DFO) in high dimensions where gradients are unavailable, introducing Curvature-Aware Random Search (CARS), which uses one-dimensional Newton-style updates along random directions with finite-difference estimates of first and second derivatives to compute a candidate step size $\alpha_+$. A safeguarding mechanism ensures descent at every iteration, while a cubic-regularized variant (CARS-CR) extends the approach to general convex functions, achieving $\mathcal{O}(k^{-1})$ convergence under standard smoothness assumptions. The authors prove linear convergence in expectation for strongly convex objectives under mild sampling-distribution conditions and characterize the sampling-distribution requirements through $\eta(g,H;\mathcal{D})$ and $p_\gamma$, with concrete lower bounds for common isotropic distributions. Empirically, CARS and CARS-CR outperform state-of-the-art zeroth-order methods on convex and nonconvex benchmarks and demonstrate strong performance in black-box adversarial attacks, supported by open-source implementations.
Abstract
The paper discusses derivative-free optimization (DFO), which involves minimizing a function using only function evaluations, without access to gradients or directional derivatives. Classical DFO methods, which mimic gradient-based methods, such as Nelder-Mead and direct search, have limited scalability for high-dimensional problems. Zeroth-order methods have been gaining popularity due to the demands of large-scale machine learning applications, and the paper focuses on the selection of the step size $\alpha_k$ in these methods. The proposed approach, called Curvature-Aware Random Search (CARS), uses first- and second-order finite-difference approximations to compute a candidate $\alpha_+$. We prove that for strongly convex objective functions, CARS converges linearly provided that the search direction is drawn from a distribution satisfying very mild conditions. We also present a cubic-regularized variant of CARS, named CARS-CR, which converges at a rate of $\mathcal{O}(k^{-1})$ without the assumption of strong convexity. Numerical experiments show that CARS and CARS-CR match or exceed the state of the art on benchmark problem sets.
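To make the step-size idea concrete, the following is a minimal sketch of one curvature-aware random-search iteration, not the authors' reference implementation. It assumes a unit-norm Gaussian search direction, central finite differences with sampling radius `r`, and a safeguard that keeps the best of the points already evaluated. The function name `cars_step` and the parameters `r` and `L_hat` (a scaling factor applied to the curvature estimate) are hypothetical names chosen for illustration.

```python
import numpy as np


def cars_step(f, x, r=1e-3, L_hat=1.0, rng=None):
    """One illustrative curvature-aware random-search step (a sketch).

    Assumptions, not taken from the paper: the direction is a normalized
    Gaussian vector, and the safeguard keeps the best evaluated point.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Draw a random search direction and normalize it to unit length.
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)

    # Three function evaluations along the line x + t * u.
    f0, f_plus, f_minus = f(x), f(x + r * u), f(x - r * u)

    # Central finite-difference estimates of the first and second
    # directional derivatives along u.
    d = (f_plus - f_minus) / (2.0 * r)
    h = (f_plus - 2.0 * f0 + f_minus) / r**2

    # Points already evaluated; including x itself guarantees that the
    # returned value never exceeds f(x).
    candidates = [(f0, x), (f_plus, x + r * u), (f_minus, x - r * u)]

    # One-dimensional Newton-style candidate, tried only when the
    # estimated curvature along u is positive.
    if h > 0:
        alpha_plus = d / (L_hat * h)
        x_newton = x - alpha_plus * u
        candidates.append((f(x_newton), x_newton))

    # Safeguard: return the candidate with the smallest function value.
    f_best, x_best = min(candidates, key=lambda c: c[0])
    return x_best, f_best
```

Calling `cars_step` repeatedly on a smooth function produces a sequence of non-increasing function values at a cost of at most four function evaluations per iteration. On a quadratic objective with `L_hat=1.0`, the finite-difference estimates are exact up to rounding, so the Newton-style candidate is the exact minimizer along the sampled direction.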
