Fully First-Order Algorithms for Online Bilevel Optimization

Tingkai Jia; Cheng Chen

TL;DR

This work tackles non-convex-upper-level online bilevel optimization with drift by eliminating second-order dependence. It reformulates the bilevel problem as a single-level online problem with an inequality constraint and develops a fully first-order online bilevel optimizer (F$^2$OBO) that uses a time-varying multiplier to approximate the original problem without Hessian–vector products. An adaptive inner-iteration variant (AF$^2$OBO) further reduces dependence on inner drift ($H_{2,T}$) by tuning inner accuracy, achieving regret bounds that are robust to aggressive distribution shift. The results show sublinear bilevel local regret under favorable parameter choices ($\tau > 1$), with $O(T \log T)$ per-iteration complexity, and a static-environment regime yielding near-optimal first-order convergence; AF$^2$OBO offers a favorable trade-off when inner drift is substantial.
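
To make the reformulation concrete, the following is a sketch of the standard value-function form used in the fully first-order bilevel literature the paper builds on (kwon2023fullyfirstordermethodstochastic). The symbols $f_t$ and $g_t$ (upper- and lower-level objectives at round $t$) and the auxiliary variable $\mathbf{z}$ are notational assumptions here; the paper's exact statement may differ in details.

$$
\min_{\mathbf{x}\in\mathcal{X},\,\mathbf{y}} \; f_t(\mathbf{x},\mathbf{y})
\quad \text{s.t.} \quad
g_t(\mathbf{x},\mathbf{y}) - \min_{\mathbf{z}} g_t(\mathbf{x},\mathbf{z}) \le 0,
$$

$$
\mathcal{L}_t(\mathbf{x},\mathbf{y},\mathbf{z};\lambda_t)
= f_t(\mathbf{x},\mathbf{y}) + \lambda_t\bigl(g_t(\mathbf{x},\mathbf{y}) - g_t(\mathbf{x},\mathbf{z})\bigr).
$$

Differentiating $\mathcal{L}_t$ with respect to $\mathbf{x}$ involves only $\nabla f_t$ and $\nabla g_t$, which is why no Hessian-vector product is needed.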

Abstract

In this work, we study non-convex-strongly-convex online bilevel optimization (OBO). Existing OBO algorithms are mainly based on hypergradient descent, which requires access to a Hessian-vector product (HVP) oracle and potentially incurs high computational costs. By reformulating the original OBO problem as a single-level online problem with inequality constraints and constructing a sequence of Lagrangian functions, we eliminate the need for HVPs arising from implicit differentiation. Specifically, we propose a fully first-order algorithm for OBO, and provide theoretical guarantees showing that it achieves regret of $O(1 + V_T + H_{2,T})$. Furthermore, we develop an improved variant with an adaptive inner-iteration scheme, which removes the dependence on the drift variation of the inner-level optimal solution and achieves regret of $O(\sqrt{T} + V_T)$. This regret bound is advantageous when $V_{T}\ge O(\sqrt{T})$.
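
As an illustration of what "fully first-order" means in practice, here is a minimal sketch on a toy quadratic problem. It is not the paper's algorithm: the objectives, the matrix `A`, the constant multiplier `lam` (the paper uses a time-varying $\lambda_t$), the step sizes, and the fixed inner-loop length are all assumptions chosen for readability. The upper variable is updated with the $\mathbf{x}$-gradient of a Lagrangian of the form $f + \lambda(g(\mathbf{x},\mathbf{y}) - g(\mathbf{x},\mathbf{z}))$, using only gradients of $f$ and $g$.

```python
import numpy as np

A = np.array([[2.0, 0.0], [0.0, 1.0]])  # toy coupling matrix (assumption)

def grad_f(x, y, b):
    """Gradients (d/dx, d/dy) of the toy upper level f(x, y) = 0.5 * ||y - b||^2."""
    return np.zeros_like(x), y - b

def grad_g(x, y):
    """Gradients (d/dx, d/dy) of the toy lower level g(x, y) = 0.5 * ||y - A x||^2."""
    r = y - A @ x
    return -A.T @ r, r

def first_order_round(x, y, z, b, lam, eta=0.1, alpha=0.5, inner_steps=20):
    """One online round using only gradients of f and g (no Hessian-vector products)."""
    for _ in range(inner_steps):
        # z approximates argmin_z g(x, z)
        z = z - alpha * grad_g(x, z)[1]
        # y approximates argmin_y f(x, y) + lam * g(x, y); that objective is
        # (1 + lam)-smooth for this toy problem, hence the rescaled step
        y = y - alpha / (1.0 + lam) * (grad_f(x, y, b)[1] + lam * grad_g(x, y)[1])
    # x-gradient of the Lagrangian f(x, y) + lam * (g(x, y) - g(x, z))
    dx = grad_f(x, y, b)[0] + lam * (grad_g(x, y)[0] - grad_g(x, z)[0])
    return x - eta * dx, y, z

def run(T=200, lam=10.0, drift=0.0):
    """Play T rounds; return the final x and the true hypergradient norm per round."""
    x, y, z = np.zeros(2), np.zeros(2), np.zeros(2)
    norms = []
    for t in range(T):
        # The upper-level target b_t moves on a small circle when drift > 0
        b = np.array([1.0, 1.0]) + drift * np.array([np.cos(0.1 * t), np.sin(0.1 * t)])
        # For this toy problem y*(x) = A x, so grad F_t(x) = A^T (A x - b_t)
        norms.append(float(np.linalg.norm(A.T @ (A @ x - b))))
        x, y, z = first_order_round(x, y, z, b, lam)
    return x, norms
```

Calling `run(drift=0.0)` gives a static environment, and `run(drift=0.05)` gives a slowly drifting one. With exact inner solves on this toy problem, the first-order direction equals the true hypergradient scaled by $\lambda/(1+\lambda)$, so the approximation error shrinks as the multiplier grows.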

Paper Structure (18 sections, 15 theorems, 81 equations, 1 table, 2 algorithms)

Introduction
Our Contributions
Related Work
Preliminaries
Notations and Assumptions
Online Bilevel Optimization
Single-Level Reformulation of OBO Problem
A Fully First-Order Algorithm for Online Bilevel Optimization
Algorithm
Regret Analysis
An Adaptive Inner-Iteration Variant of F$^2$OBO
Algorithm
Regret Analysis
Discussion
Conclusion
...and 3 more sections

Key Result

Lemma 3.1

For any $\mathbf{x}\in\mathcal{X}$ and $\lambda_t\geq\frac{2L_{f,1}}{\mu_g}$ for all $t\in[T]$, we have [inequality missing in source], where $D_1$ is some constant.

Theorems & Definitions (22)

Lemma 3.1: Based on Lemma 3.1 in kwon2023fullyfirstordermethodstochastic
Lemma 3.2
Lemma 3.3
Theorem 3.4
Lemma 4.1
Theorem 4.2
Lemma A.1: Lemma 16 in tarzanagh2024onlinebileveloptimizationregret
Lemma A.2: Lemma 4.1 in chen2023near
Lemma A.3: Based on Lemma 3.2 in kwon2023fullyfirstordermethodstochastic
proof
...and 12 more