Safe Gradient Flow for Bilevel Optimization

Sina Sharifi, Nazanin Abolfazli, Erfan Yazdandoost Hamedani, Mahyar Fazlyab

TL;DR

This work addresses the computational challenge of bilevel optimization by combining a gradient-flow strategy for the upper-level objective with a safety filter that enforces the lower-level optimality constraints in a single loop. The core approach, termed safe gradient flow (SGF), projects the velocity onto a constraint-consistent manifold by solving a convex QP, yielding closed-form updates that guarantee forward invariance of the lower-level constraint set and convergence to a neighborhood of the bilevel optimum via Lyapunov analysis. To scale to high-dimensional lower-level problems, the authors derive an inversion-free variant that avoids Hessian inversions through a single-level reformulation built on the constraint function $h(x,y)=\|\nabla_y g(x,y)\|^2$, together with a relaxed version that enforces $h(x,y) \le \varepsilon^2$ and includes a controlled-proximity term; both come with convergence guarantees. Theoretical results establish Lipschitz continuity of the projection-based dynamics and a nonincreasing Lyapunov energy, while experiments on synthetic benchmarks and MNIST-based data hyper-cleaning show performance gains and robustness of the proposed methods. Overall, the paper provides a principled, controllable, and scalable framework for solving bilevel problems in a single loop with provable guarantees and empirical validation.

Abstract

Bilevel optimization is a key framework in hierarchical decision-making, where one problem is embedded within the constraints of another. In this work, we propose a control-theoretic approach to solving bilevel optimization problems. Our method consists of two components: a gradient flow mechanism to minimize the upper-level objective and a safety filter to enforce the constraints imposed by the lower-level problem. Together, these components form a safe gradient flow that solves the bilevel problem in a single loop. To improve scalability with respect to the lower-level problem's dimensions, we introduce a relaxed formulation and design a compact variant of the safe gradient flow. This variant minimizes the upper-level objective while ensuring the lower-level decision variable remains within a user-defined suboptimality. Using Lyapunov analysis, we establish convergence guarantees for the dynamics, proving that they converge to a neighborhood of the optimal solution. Numerical experiments further validate the effectiveness of the proposed approaches. Our contributions provide both theoretical insights and practical tools for efficiently solving bilevel optimization problems.
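The two components described above can be illustrated with a minimal sketch on a hypothetical scalar toy problem. This is not the authors' algorithm or code. The toy objectives, the forward-Euler discretization, the step size `dt`, and all function names are assumptions made for illustration. The lower-level problem is $g(x,y)=\tfrac12(y-x)^2$, so $y^\star(x)=x$, and the relaxed constraint is $h(x,y)=\|\nabla_y g(x,y)\|^2 \le \varepsilon^2$. The safety filter solves the QP $\min_v \|v+\nabla f(z)\|^2$ subject to $\nabla h(z)^\top v \le -\alpha\,(h(z)-\varepsilon^2)$, which has a closed-form solution for a single constraint.

```python
import numpy as np

# Hypothetical toy problem (an assumption, not taken from the paper):
#   upper level: f(x, y) = 0.5*(x - 1)^2 + 0.5*(y - 2)^2
#   lower level: g(x, y) = 0.5*(y - x)^2, so y*(x) = x
#   relaxed constraint: h(x, y) = |grad_y g|^2 = (y - x)^2 <= eps^2

def grad_f(z):
    x, y = z
    return np.array([x - 1.0, y - 2.0])

def h(z):
    x, y = z
    return (y - x) ** 2

def grad_h(z):
    x, y = z
    return np.array([-2.0 * (y - x), 2.0 * (y - x)])

def safe_velocity(z, alpha, eps):
    """Closed-form solution of the single-constraint QP safety filter."""
    u = -grad_f(z)                    # nominal gradient-flow velocity
    a = grad_h(z)                     # constraint gradient
    # Amount by which the nominal velocity violates a^T v <= -alpha*(h - eps^2)
    violation = a @ u + alpha * (h(z) - eps ** 2)
    denom = a @ a
    # If grad_h vanishes (h = 0), the constraint holds trivially for eps > 0
    lam = max(0.0, violation / denom) if denom > 1e-12 else 0.0
    return u - lam * a                # projected ("safe") velocity

def run(z0=(0.0, 0.0), alpha=1.0, eps=0.1, dt=0.01, steps=2000):
    """Forward-Euler integration of the filtered flow (a discretization choice)."""
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        z = z + dt * safe_velocity(z, alpha, eps)
    return z

z_final = run()
print(z_final, h(z_final))
```

For this toy problem the constrained minimizer is $x=(3-\varepsilon)/2$, $y=x+\varepsilon$, so with $\varepsilon=0.1$ the iterate should settle near $(1.45, 1.55)$, within $O(\varepsilon)$ of the exact bilevel solution $x=y=1.5$. Shrinking `eps` tightens the lower-level suboptimality at the cost of a stiffer filter near the constraint boundary.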

Paper Structure

This paper contains 20 sections, 11 theorems, 61 equations, 4 figures.

Key Result

Lemma 1

Under the paper's upper-level and lower-level assumptions, there exists $M_1>0$ such that $\left\|\nabla \ell(x) - F(x,y)\right\| \leq M_1 \left\|y-y^\star(x)\right\|$ for all $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$.
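As a sanity check of what the lemma asserts, consider a hypothetical scalar example that is not from the paper. This excerpt does not define $F$, so the worked case below assumes the standard implicit-gradient surrogate $F(x,y)=\nabla_x f(x,y)-\nabla^2_{xy} g(x,y)\,[\nabla^2_{yy} g(x,y)]^{-1}\nabla_y f(x,y)$ and $\ell(x)=f(x,y^\star(x))$.

```latex
\begin{aligned}
f(x,y) &= \tfrac12 (x-1)^2 + \tfrac12 (y-2)^2, \qquad
g(x,y) = \tfrac12 (y-x)^2, \qquad y^\star(x) = x, \\
\nabla \ell(x) &= (x-1) + (x-2) = 2x - 3, \\
F(x,y) &= (x-1) - (-1)(1)^{-1}(y-2) = x + y - 3, \\
\left\|\nabla \ell(x) - F(x,y)\right\| &= |x - y| = \left\|y - y^\star(x)\right\|.
\end{aligned}
```

In this example the bound holds with equality and $M_1 = 1$: the surrogate's error is controlled by how far $y$ is from the lower-level solution.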

Figures (4)

  • Figure 1: An overview of the proposed method. The gradient flow attempts to minimize the upper-level objective, while the safety filter ensures that the constraints induced by the lower-level problem are satisfied.
  • Figure 2: Effect of $\varepsilon$ on the convergence of the surrogate map (left) and the lower-level problem (middle) with $\alpha = 0.01$. Effect of $\alpha$ on lower-level behavior (right).
  • Figure 3: Comparison of the validation loss and the test accuracy between our inversion-free method and AIDBio with $\alpha, \beta \in \{0.001, 0.01, 0.1\}$ for $p \in \{25\%, 40\%\}$.
  • Figure 4: Comparison between our second-order method and STABLE with $\alpha = 1$ and $\beta \in \{10^{-2}, 5\!\times\!10^{-2}, 10^{-1}, 5\!\times\!10^{-1} \}$.

Theorems & Definitions (26)

  • Lemma 1
  • proof
  • Theorem 1: Convergence of \ref{eq: gradient flow upper level safe}
  • proof
  • Proposition 1
  • proof
  • Theorem 2: Convergence of \ref{eq: gradient flow upper level safe 2}
  • proof
  • Remark 1: Comparison with liu2022bome
  • Lemma 2
  • ...and 16 more