Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh, Li Shen, George Michailidis

TL;DR

This work advances online bilevel optimization (OBO) by removing the need for window-smoothed regret and providing sublinear stochastic bilevel regret guarantees for both first-order and zeroth-order settings. It introduces a novel momentum-like search direction and a simultaneous online gradient descent (SOGD) framework that updates the leader, follower, and auxiliary variables in a single loop, using Hessian-vector and Jacobian-vector products without full inner problem solves. In the zeroth-order regime, the paper leverages Gaussian smoothing and finite-difference estimators to construct hypergradient surrogates based on function-value feedback, achieving dimension-dependent but sublinear regret bounds. Theoretical results are complemented by experiments on online parametric loss tuning and black-box adversarial attacks, demonstrating practical efficiency and robustness under limited feedback. Overall, the approach broadens the applicability of online bilevel optimization to large-scale and black-box settings with provable dynamic performance guarantees.
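The zeroth-order idea mentioned above can be illustrated with a generic two-point Gaussian-smoothing gradient estimator. This is a minimal sketch of the standard technique, not the paper's hypergradient estimator: it approximates an ordinary gradient from function values only, and the names `zo_gradient`, `mu`, and `num_samples` are assumptions introduced here for illustration.

```python
import numpy as np


def zo_gradient(f, x, mu=1e-3, num_samples=2000, rng=None):
    """Estimate the gradient of f at x from function values only.

    Generic two-point Gaussian-smoothing estimator (illustrative; not the
    paper's exact construction):
        g ~ mean over u ~ N(0, I) of  (f(x + mu * u) - f(x)) / mu * u
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    fx = f(x)  # one function query at the base point, reused for every sample
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)  # random Gaussian direction
        grad += (f(x + mu * u) - fx) / mu * u  # finite difference along u
    return grad / num_samples


if __name__ == "__main__":
    # Hypothetical test function: f(x) = ||x||^2, whose true gradient is 2x.
    x0 = np.array([1.0, -2.0, 0.5])
    est = zo_gradient(lambda v: float(v @ v), x0, mu=1e-4,
                      num_samples=20000, rng=np.random.default_rng(0))
    print("estimate:", est)
    print("true    :", 2 * x0)
```

The variance of such an estimator grows with the dimension of `x`, which is consistent with the dimension-dependent regret bounds described above; averaging more samples reduces the variance at the cost of more function queries.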

Abstract

Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time, requiring dynamic updates. Current OBO approaches rely on deterministic window-smoothed regret minimization, which may not accurately reflect system performance when functions change rapidly. In this work, we introduce a novel search direction and show that both first- and zeroth-order (ZO) stochastic OBO algorithms leveraging this direction achieve sublinear stochastic bilevel regret without window smoothing. Beyond these guarantees, our framework enhances efficiency by: (i) reducing oracle dependence in hypergradient estimation, (ii) updating inner and outer variables alongside the linear system solution, and (iii) employing ZO-based estimation of Hessians, Jacobians, and gradients. Experiments on online parametric loss tuning and black-box adversarial attacks validate our approach.

Paper Structure

This paper contains 27 sections, 42 theorems, 397 equations, 3 figures, 4 tables, 2 algorithms.

Key Result

Lemma 2.1

Let $w = t$, $W=1/\eta$ and $\nu=1-\eta$ for $\eta\in (0,1)$ in the window-smoothed gradient ${\nabla} F_{t,\nu}({\bf{x}}_t, {\bf{y}}_t;\mathcal{B}_t) = \frac{1}{W} \sum_{i=0}^{w-1} \nu^i {\nabla} f_{t-i}({\bf{x}}_{t-i}, {\bf{y}}_{t-i};\mathcal{B}_{t-i})$, where $\mathcal{B}_t := \{\xi_{t,1}, \ldots
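The statement of Lemma 2.1 is cut off in this excerpt, so its conclusion is not reproduced here. As a side observation that follows directly from the definition of the window-smoothed gradient quoted above (and which may or may not be what the lemma itself asserts), substituting $w = t$, $W = 1/\eta$ and $\nu = 1-\eta$ turns the weighted sum into an exponential moving average. The shorthand $\nabla F_{t-1,\nu}$ below denotes the same expression at round $t-1$ with window $w = t-1$.

```latex
\begin{align*}
\nabla F_{t,\nu}(\mathbf{x}_t, \mathbf{y}_t; \mathcal{B}_t)
  &= \eta \sum_{i=0}^{t-1} (1-\eta)^i \,
     \nabla f_{t-i}(\mathbf{x}_{t-i}, \mathbf{y}_{t-i}; \mathcal{B}_{t-i}) \\
  &= \eta \, \nabla f_t(\mathbf{x}_t, \mathbf{y}_t; \mathcal{B}_t)
     + (1-\eta) \, \eta \sum_{j=0}^{t-2} (1-\eta)^j \,
     \nabla f_{t-1-j}(\mathbf{x}_{t-1-j}, \mathbf{y}_{t-1-j}; \mathcal{B}_{t-1-j}) \\
  &= \eta \, \nabla f_t(\mathbf{x}_t, \mathbf{y}_t; \mathcal{B}_t)
     + (1-\eta) \, \nabla F_{t-1,\nu}(\mathbf{x}_{t-1}, \mathbf{y}_{t-1}; \mathcal{B}_{t-1}),
  \qquad t \ge 2 .
\end{align*}
```

The second line splits off the $i = 0$ term and re-indexes the remainder with $j = i - 1$. The resulting recursion has the form of a momentum update with weight $\eta$ on the newest stochastic gradient.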

Figures (3)

  • Figure 1: Smoothly and rapidly changing $f_t$ in OBO with $g_t(x_t, y_t) = (y_t - \cos(x_t))^2$, $a_t = 1 + 0.5 \sin(t)$, $b_t = 1 + \sin(0.5t)$, and $c_t = 10 b_t$.
  • Figure 2: Performance comparison (mean$\pm$std) of optimizers including ZO-O-GD, ZO-O-Adam, ZO-O-SignSGD, ZO-O-ConservSGD, ZO-SOGD, and ZO-SOGD (Adam) on online adversarial attack for MNIST data across five runs.
  • Figure 3: Performance (mean$\pm$std) on online parametric loss tuning with distribution shift on MNIST across five runs, comparing OGD [zinkevich2003online], OAGD [tarzanagh2024online], SOBOW [lin2024non], and our SOGD.

Theorems & Definitions (85)

  • Lemma 2.1
  • Theorem 2.6
  • Remark 2.7: Stochastic Regret Guarantee for OBO and OSO with $w=1$
  • Theorem 3.2
  • Remark 3.3: Regret Guarantee for Zeroth Order OBO
  • Remark 3.4: Improved Regret for OSO
  • Definition B.1: Projected gradient [ghadimi2016mini]
  • Lemma B.2
  • Lemma B.3
  • Lemma B.4
  • ...and 75 more