Fully Zeroth-Order Bilevel Programming via Gaussian Smoothing
Alireza Aghasi, Saeed Ghadimi
TL;DR
The paper addresses stochastic bilevel optimization when neither objective nor gradient information is available in closed form. It develops a fully zeroth-order framework by extending Gaussian smoothing via Stein's identity to functions with two independent variable blocks, providing gradient and Hessian estimators based solely on function evaluations. A two-loop algorithm (ZDSBA) combines inner zeroth-order SGD for the lower problem with outer zeroth-order steps for the upper problem, backed by a zeroth-order Hessian-inverse routine (SZHIA); the authors prove non-asymptotic convergence and derive explicit sample complexity bounds for various convexity regimes. This work enables practical zeroth-order solutions for large-scale bilevel problems and offers foundational tools for derivative-free optimization in hierarchical learning and decision-making tasks, while highlighting open questions on dimensionality dependence and potential improvements when first-order information becomes available.
Abstract
In this paper, we study and analyze zeroth-order stochastic approximation algorithms for solving bilevel problems when neither the upper/lower objective values nor their unbiased gradient estimates are available. In particular, exploiting Stein's identity, we first use Gaussian smoothing to estimate first- and second-order partial derivatives of functions with two independent blocks of variables. We then use these estimates in the framework of a stochastic approximation algorithm for solving bilevel optimization problems and establish its non-asymptotic convergence analysis. To the best of our knowledge, this is the first time that sample complexity bounds are established for a fully stochastic zeroth-order bilevel optimization algorithm.
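To make the Gaussian-smoothing idea concrete, the following is a minimal sketch of a zeroth-order estimate of the partial gradient with respect to the first block of a two-block function f(x, y). It illustrates the general technique and is not necessarily the exact estimator analyzed in the paper. The names `zo_grad_x` and `f_demo`, the smoothing parameters `eta` and `mu`, and the sample count are all assumptions made for this example. Both blocks are perturbed with independent standard Gaussian directions, and by Stein's identity the average of (f(x + eta*u, y + mu*v) - f(x, y)) / eta * u is an unbiased estimate of the x-gradient of the smoothed function.

```python
import random


def zo_grad_x(f, x, y, eta=1e-2, mu=1e-2, n_samples=1000, rng=None):
    """Monte Carlo Gaussian-smoothing estimate of the partial gradient in x.

    Hypothetical helper for illustration only. It uses function values of f
    and never calls a gradient oracle.
    """
    rng = rng or random.Random()
    base = f(x, y)  # subtracting f(x, y) reduces variance without adding bias
    grad = [0.0] * len(x)
    for _ in range(n_samples):
        # Independent standard Gaussian directions, one per variable block.
        u = [rng.gauss(0.0, 1.0) for _ in x]
        v = [rng.gauss(0.0, 1.0) for _ in y]
        x_pert = [xi + eta * ui for xi, ui in zip(x, u)]
        y_pert = [yi + mu * vi for yi, vi in zip(y, v)]
        scale = (f(x_pert, y_pert) - base) / eta
        for i, ui in enumerate(u):
            grad[i] += scale * ui
    return [g / n_samples for g in grad]


def f_demo(x, y):
    """Toy objective (an assumption): f(x, y) = <x, y> + 0.5 * ||x||^2."""
    return sum(a * b for a, b in zip(x, y)) + 0.5 * sum(a * a for a in x)


if __name__ == "__main__":
    estimate = zo_grad_x(f_demo, [1.0, 2.0], [3.0, -1.0],
                         n_samples=20000, rng=random.Random(0))
    # For this toy objective the exact partial gradient in x is x + y.
    print(estimate)
```

For this quadratic toy objective, smoothing only adds a constant to f, so the estimate targets the exact partial gradient x + y. In general the estimator targets the gradient of the smoothed function, which differs from the true gradient by a bias controlled by the smoothing parameters.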
