Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems

Huaqing Zhang, Lesi Chen, Jing Xu, Jingzhao Zhang

TL;DR

The paper investigates simple bilevel optimization, where the upper-level objective $f$ is minimized over the solution set of a convex lower-level problem $\min_{\mathbf z\in\mathcal{Z}} g(\mathbf z)$. It proves that zero-respecting first-order methods cannot, in general, find absolute optimal solutions, that is, points whose upper-level value approximates the optimal value. It therefore targets $(\epsilon_f,\epsilon_g)$-weak optimal solutions and proposes FC-BiO, a method based on a functionally constrained reformulation that achieves near-optimal rates in both the Lipschitz and the smooth regime. FC-BiO is a two-loop algorithm: the outer loop is a bisection, and each inner step solves a min-max subproblem, using a Subgradient Method in the Lipschitz case and a generalized Accelerated Gradient Method in the smooth case. The authors also establish lower bounds, and the upper bounds of FC-BiO match them up to logarithmic factors. Experiments on a minimum-norm problem and an over-parameterized logistic regression problem show that the method is efficient in practice. The framework requires only standard smoothness or Lipschitz assumptions on the objectives.
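The two-loop structure described above can be illustrated on a toy one-dimensional instance. The following is a minimal sketch, not the authors' implementation: the objectives `f` and `g`, the function names, the step size, the iteration counts, the acceptance threshold `eps / 2`, and the assumption that the lower-level optimal value is known exactly (`g_hat = 0`) are all illustrative choices made here. The inner loop is a plain projected subgradient method on $\psi(t, x) = \max\{f(x) - t,\ g(x) - \hat g\}$, and the outer loop bisects on the level $t$.

```python
import math

def f(x):
    """Toy upper-level objective (1-Lipschitz, nonsmooth)."""
    return abs(x - 3.0)

def g(x):
    """Toy lower-level objective; its minimizers are [-1, 1] and g* = 0."""
    return max(abs(x) - 1.0, 0.0)

def sign(v):
    return float((v > 0) - (v < 0))

def psi_and_subgrad(t, x, g_hat):
    """Value and one subgradient of psi(t, x) = max(f(x) - t, g(x) - g_hat)."""
    upper, lower = f(x) - t, g(x) - g_hat
    if upper >= lower:
        return upper, sign(x - 3.0)          # subgradient of the active piece f
    return lower, (sign(x) if abs(x) > 1.0 else 0.0)  # active piece g

def inner_subgradient(t, g_hat, x0=0.0, iters=10_000, radius=2.0):
    """Projected subgradient method on psi(t, .) over Z = [-5, 5].

    Uses the constant step radius / sqrt(iters) (Lipschitz constant 1)
    and returns the best iterate seen together with its psi value.
    """
    step = radius / math.sqrt(iters)
    x, best_x, best_val = x0, x0, float("inf")
    for _ in range(iters):
        val, sub = psi_and_subgrad(t, x, g_hat)
        if val < best_val:
            best_x, best_val = x, val
        x = min(max(x - step * sub, -5.0), 5.0)  # projection onto Z
    return best_x, best_val

def bisection_sketch(eps=0.1, bisections=10):
    """Outer bisection on the level t; returns an approximately weak-optimal x."""
    g_hat = 0.0                # assumption: g* is known exactly in this toy case
    lo, hi = 0.0, f(0.0)       # f >= 0 everywhere, and x = 0 solves the lower level
    x_best = 0.0
    for _ in range(bisections):
        t = 0.5 * (lo + hi)
        x, val = inner_subgradient(t, g_hat)
        if val <= eps / 2:     # level t is (approximately) attainable
            hi, x_best = t, x
        else:                  # level t is too ambitious
            lo = t
    return x_best

x_hat = bisection_sketch()
print(x_hat, f(x_hat), g(x_hat))
```

On this instance the upper-level optimum over $[-1, 1]$ is attained at $x^* = 1$ with $f^* = 2$. The returned point may have $f(\hat x)$ slightly below $f^*$ because it is allowed a small lower-level violation, which is exactly why the guarantee is a weak one rather than an absolute one.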

Abstract

This paper studies simple bilevel problems, where a convex upper-level function is minimized over the optimal solutions of a convex lower-level problem. We first show the fundamental difficulty of simple bilevel problems, that the approximate optimal value of such problems is not obtainable by first-order zero-respecting algorithms. Then we follow recent works to pursue the weak approximate solutions. For this goal, we propose a novel method by reformulating them into functionally constrained problems. Our method achieves near-optimal rates for both smooth and nonsmooth problems. To the best of our knowledge, this is the first near-optimal algorithm that works under standard assumptions of smoothness or Lipschitz continuity for the objective functions.
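For reference, the objects mentioned in the abstract can be written out as follows. This is a sketch in notation assumed here ($f^*$ for the optimal upper-level value, $g^*$ for the optimal lower-level value, $\hat{\mathbf x}$ for the returned point); the paper's own statements may differ in constants and in how $g^*$ is approximated.

```latex
% Simple bilevel problem
\min_{\mathbf{x} \in \mathcal{Z}} f(\mathbf{x})
\quad \text{s.t.} \quad
\mathbf{x} \in \arg\min_{\mathbf{z} \in \mathcal{Z}} g(\mathbf{z})

% (\epsilon_f, \epsilon_g)-weak optimal solution
f(\hat{\mathbf{x}}) - f^* \le \epsilon_f,
\qquad
g(\hat{\mathbf{x}}) - g^* \le \epsilon_g

% Absolute optimality additionally requires a two-sided bound
|f(\hat{\mathbf{x}}) - f^*| \le \epsilon_f

% Functionally constrained reformulation
\min_{\mathbf{x} \in \mathcal{Z}} f(\mathbf{x})
\quad \text{s.t.} \quad
g(\mathbf{x}) - g^* \le 0
```

The weak notion only bounds $f(\hat{\mathbf{x}}) - f^*$ from above, so a point that slightly violates lower-level optimality may have an upper-level value below $f^*$.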

Paper Structure

This paper contains 36 sections, 22 theorems, 82 equations, 2 figures, and 3 algorithms.

Key Result

Theorem 4.1

For any first-order algorithm $\mathcal{A}$ that satisfies the paper's algorithm-class assumption and runs for $T$ iterations, and for any initial point $\mathbf{x}_0$, there exists a $(1,1)$-smooth instance of Problem (Simple BiO) such that the optimal solution $\mathbf{x}^*$ satisfies $\|\mathbf{x}_0-\mathbf{x}^*\|_2\leq 1$

Figures (2)

  • Figure 1: The performance of Algorithm \ref{alg: smooth} compared with other methods in Problem (\ref{eq:MNP}).
  • Figure 2: The performance of Algorithm \ref{alg: smooth} compared with other methods in Problem (\ref{exp: OPR}).

Theorems & Definitions (42)

  • Definition 3.1: first-order zero-chain
  • Theorem 4.1
  • Theorem 4.2
  • Remark 4.1
  • Theorem 5.1
  • Theorem 5.2
  • Lemma 5.1
  • Lemma 5.2 [nesterov2018lectures]
  • Lemma 5.3
  • Proposition 5.1 [nesterov2018lectures]
  • ...and 32 more