An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization

Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari

TL;DR

This work tackles simple bilevel optimization with convex smooth upper- and lower-level objectives by introducing AGM-BiO, an accelerated gradient method that uses a cutting plane to locally approximate the lower-level solution set and a projection-based accelerated update to reduce the upper-level objective. It establishes non-asymptotic guarantees for both suboptimality and infeasibility: when the feasible set is compact, at most O(max{1/√ε_f, 1/ε_g}) iterations are needed to find a solution that is ε_f-suboptimal and ε_g-infeasible. If the lower-level objective additionally satisfies an r-th order Hölderian error bound, the iteration complexity becomes O(max{ε_f^{−(2r−1)/(2r)}, ε_g^{−(2r−1)/(2r)}}). Under weak sharpness (r=1), this reduces to O(max{1/√ε_f, 1/√ε_g}), which matches the optimal complexity of single-level convex constrained optimization. Numerical experiments compare AGM-BiO against several baselines and report strong performance, especially in high-dimensional settings; the paper also discusses extensions to non-smooth/composite cases.
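The cutting-plane idea mentioned above can be illustrated with a small sketch. It assumes the cut is the linearization of the lower-level objective g at a point y, that is, the halfspace {w : g(y) + ⟨∇g(y), w − y⟩ ≤ g_k}, and it ignores the original feasible set, so the projection has a closed form. The function name, the toy objective, and the choice g_k = 0 are illustrative assumptions, not the paper's implementation.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def project_onto_cutting_plane(
    z: Sequence[float],
    y: Sequence[float],
    g_y: float,
    grad_g_y: Sequence[float],
    g_k: float,
) -> List[float]:
    """Project z onto H = {w : g(y) + <grad g(y), w - y> <= g_k}.

    Assumption: the feasible set is the whole space, so only the
    halfspace constraint matters. This is a simplification for
    illustration.
    """
    # Amount by which z violates the linearized constraint.
    slack = g_y + dot(grad_g_y, [zi - yi for zi, yi in zip(z, y)]) - g_k
    norm_sq = dot(grad_g_y, grad_g_y)
    # Already inside the halfspace (or degenerate zero gradient): keep z.
    if slack <= 0.0 or norm_sq == 0.0:
        return list(z)
    # Move along -grad g(y) just far enough to reach the boundary.
    step = slack / norm_sq
    return [zi - step * gi for zi, gi in zip(z, grad_g_y)]


# Toy lower-level objective (hypothetical): g(w) = 0.5 * (w[0] - 1)^2,
# whose solution set is {w : w[0] = 1} with optimal value 0.
y = [3.0, 0.0]
g_y = 0.5 * (y[0] - 1.0) ** 2          # g(y) = 2.0
grad_g_y = [y[0] - 1.0, 0.0]           # grad g(y) = (2.0, 0.0)
projected = project_onto_cutting_plane([4.0, 5.0], y, g_y, grad_g_y, g_k=0.0)
print(projected)
```

In this toy case the halfspace is {w : w[0] ≤ 2}, which contains the whole solution set {w : w[0] = 1}. By convexity of g, this containment holds whenever g_k is at least the lower-level optimal value, which is why such a cut can act as a surrogate for the solution set.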

Abstract

In this paper, we focus on simple bilevel optimization problems, where we minimize a convex smooth objective function over the optimal solution set of another convex smooth constrained optimization problem. We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem using a cutting plane approach and employs an accelerated gradient-based update to reduce the upper-level objective function over the approximated solution set. We measure the performance of our method in terms of suboptimality and infeasibility errors and provide non-asymptotic convergence guarantees for both error criteria. Specifically, when the feasible set is compact, we show that our method requires at most $\mathcal{O}(\max\{1/\sqrt{\epsilon_f}, 1/\epsilon_g\})$ iterations to find a solution that is $\epsilon_f$-suboptimal and $\epsilon_g$-infeasible. Moreover, under the additional assumption that the lower-level objective satisfies the $r$-th Hölderian error bound, we show that our method achieves an iteration complexity of $\mathcal{O}(\max\{\epsilon_f^{-\frac{2r-1}{2r}}, \epsilon_g^{-\frac{2r-1}{2r}}\})$, which matches the optimal complexity of single-level convex constrained optimization when $r=1$.
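To make the two complexity expressions concrete, the following sketch evaluates their order of magnitude for given tolerances. It drops all constants hidden by the O-notation (and any logarithmic factors), so the returned numbers only indicate scaling; the function names are hypothetical.

```python
from typing import Optional


def holder_exponent(r: float) -> float:
    """Exponent p = (2r - 1) / (2r) in the bound eps^{-p}."""
    if r < 1.0:
        raise ValueError("this sketch assumes r >= 1")
    return (2.0 * r - 1.0) / (2.0 * r)


def iteration_order(eps_f: float, eps_g: float, r: Optional[float] = None) -> float:
    """Order of the iteration count, ignoring constants.

    r is None : compact feasible set only,
                max(1 / sqrt(eps_f), 1 / eps_g).
    r given   : r-th Hölderian error bound,
                max(eps_f^{-p}, eps_g^{-p}) with p = (2r - 1) / (2r).
    """
    if r is None:
        return max(eps_f ** -0.5, 1.0 / eps_g)
    p = holder_exponent(r)
    return max(eps_f ** -p, eps_g ** -p)


if __name__ == "__main__":
    # Same tolerances, with and without the error-bound assumption.
    print(iteration_order(1e-4, 1e-3))         # compact case
    print(iteration_order(1e-4, 1e-3, r=1.0))  # r = 1, exponent 1/2
    print(iteration_order(1e-4, 1e-3, r=2.0))  # r = 2, exponent 3/4
```

With r = 1 the exponent is 1/2, so both tolerances enter as 1/√ε, which is the case the abstract describes as matching single-level convex constrained optimization. As r grows the exponent approaches 1.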

Paper Structure

This paper contains 18 sections, 7 theorems, 86 equations, 2 figures, 1 table, 2 algorithms.

Key Result

Theorem 4.1

Suppose Assumption ass:1 holds. Let $\{\mathbf{x}_k\}_{k\geq 0}$ be the sequence of iterates generated by Algorithm alg:AGM-BiO with stepsize $a_k = (k+1)/(4L_f)$ for $k \geq 0$ and suppose the sequence $g_k$ used for generating the cutting plane satisfies eq:convergence_g. Then, for any $k\geq 0$ w
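The stepsize rule in the theorem statement can be tabulated directly. The sketch below computes a_k = (k+1)/(4 L_f) and its running sum; treating that running sum as the weight accumulator of the accelerated scheme is an assumption based on how such methods are usually analyzed, not something the excerpt above states. The function names and the value of L_f are illustrative.

```python
from typing import List


def stepsizes(num_iters: int, L_f: float) -> List[float]:
    """Stepsizes a_k = (k + 1) / (4 * L_f) for k = 0, ..., num_iters - 1."""
    if L_f <= 0.0:
        raise ValueError("L_f must be positive")
    return [(k + 1) / (4.0 * L_f) for k in range(num_iters)]


def cumulative_weight(num_iters: int, L_f: float) -> float:
    """Running sum a_0 + ... + a_{K-1} (assumed accumulator, see lead-in)."""
    return sum(stepsizes(num_iters, L_f))


def cumulative_weight_closed_form(num_iters: int, L_f: float) -> float:
    """Closed form K * (K + 1) / (8 * L_f) of the same sum."""
    return num_iters * (num_iters + 1) / (8.0 * L_f)


if __name__ == "__main__":
    L_f = 2.0  # hypothetical smoothness constant of the upper-level objective
    for K in (1, 10, 100):
        print(K, cumulative_weight(K, L_f), cumulative_weight_closed_form(K, L_f))
```

The sum grows quadratically in the number of iterations K. In standard accelerated-gradient analyses this quadratic growth is what produces an O(1/K²) decay of the objective gap, which is consistent with the 1/√ε_f term in the stated complexity.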

Figures (2)

  • Figure 1: Comparison of a-IRG, CG-BiO, Bi-SG, SEA, R-APM, PB-APG, and AGM-BiO for solving the over-parameterized regression problem.
  • Figure 2: Comparison of a-IRG, Bi-SG, SEA, R-APM, PB-APG, Bisec-BiO, and AGM-BiO for solving the linear inverse problem.

Theorems & Definitions (20)

  • Definition 2.1
  • Definition 2.2
  • Remark 3.1
  • Remark 3.2
  • Theorem 4.1
  • Remark 4.1: The necessity of compactness of $\mathcal{Z}$
  • Remark 4.2: Removable $\log$ terms
  • Proposition 4.2 [jiangconditional]
  • Lemma 4.3
  • Theorem 4.4
  • ...and 10 more