Alternating Gradient-Type Algorithm for Bilevel Optimization with Inexact Lower-Level Solutions via Moreau Envelope-based Reformulation

Xiaoning Bai, Shangzhi Zeng, Jin Zhang, Lezhi Zhang

TL;DR

This work tackles bilevel optimization with a convex lower-level problem by introducing AGILS, an alternating gradient-type algorithm that combines a Moreau envelope reformulation $(\mathrm{VP})_\gamma$ with inexact proximal lower-level solves. AGILS uses an adaptive penalty and a feasibility-correction step to maintain constraint satisfaction. It is guaranteed to converge to KKT points of a relaxed problem $(\mathrm{VP})_\gamma^{\epsilon}$: subsequential convergence holds under mild assumptions, and full-sequence convergence holds under the KL property. The method is validated on a toy problem and a sparse group Lasso hyperparameter selection task, where it is reported to be more efficient than several baselines and robust to inexactness in the lower-level solutions. Overall, AGILS targets high-dimensional bilevel problems where exact lower-level solves are impractical, offering provable convergence and competitive empirical performance.

Abstract

In this paper, we study a class of bilevel optimization problems where the lower-level problem is a convex composite optimization model, which arises in various applications, including bilevel hyperparameter selection for regularized regression models. To solve these problems, we propose an Alternating Gradient-type algorithm with Inexact Lower-level Solutions (AGILS) based on a Moreau envelope-based reformulation of the bilevel optimization problem. The proposed algorithm does not require exact solutions of the lower-level problem at each iteration, improving computational efficiency. We prove the convergence of AGILS to stationary points and, under the Kurdyka-Łojasiewicz (KL) property, establish its sequential convergence. Numerical experiments, including a toy example and a bilevel hyperparameter selection problem for the sparse group Lasso model, demonstrate the effectiveness of the proposed AGILS.
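To make the Moreau envelope idea concrete, the following is a minimal sketch, not the AGILS algorithm itself. It assumes the standard Moreau envelope value function $v_\gamma(x, y) = \min_\theta \{\varphi(x, \theta) + \frac{1}{2\gamma}\|\theta - y\|^2\}$ and a hypothetical lower-level objective $\varphi(x, \theta) = \frac{1}{2}\|\theta - b\|^2 + x\|\theta\|_1$, chosen only because its proximal subproblem has a closed form to compare against. The function names, the step size, and the residual-based stopping rule are illustrative assumptions and are not taken from the paper.

```python
import math

def soft_threshold(v, t):
    """Elementwise proximal operator of t * |.| (soft-thresholding)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def phi(x, theta, b):
    """Hypothetical lower-level objective: 0.5*||theta - b||^2 + x*||theta||_1."""
    return (0.5 * sum((t - bi) ** 2 for t, bi in zip(theta, b))
            + x * sum(abs(t) for t in theta))

def prox_exact(x, y, b, gamma):
    """Closed-form minimizer of phi(x, theta) + ||theta - y||^2 / (2*gamma)."""
    center = [(gamma * bi + yi) / (1.0 + gamma) for bi, yi in zip(b, y)]
    return soft_threshold(center, gamma * x / (1.0 + gamma))

def prox_inexact(x, y, b, gamma, tol, max_iter=1000):
    """Approximate the same minimizer by proximal gradient steps.

    Stops once the proximal-gradient residual drops below `tol`
    (an assumed stopping rule, used here only for illustration).
    """
    lipschitz = 1.0 + 1.0 / gamma      # Lipschitz constant of the smooth part
    step = 0.5 / lipschitz             # deliberately conservative step size
    theta = list(y)                    # warm start at the proximal center
    for _ in range(max_iter):
        grad = [(t - bi) + (t - yi) / gamma for t, bi, yi in zip(theta, b, y)]
        new_theta = soft_threshold(
            [t - step * g for t, g in zip(theta, grad)], step * x)
        residual = math.sqrt(
            sum((n - t) ** 2 for n, t in zip(new_theta, theta))) / step
        theta = new_theta
        if residual <= tol:
            break
    return theta

def envelope_gap(x, y, b, gamma, theta):
    """Estimate phi(x, y) - v_gamma(x, y) using a (possibly inexact) theta."""
    v_est = phi(x, theta, b) + sum(
        (t - yi) ** 2 for t, yi in zip(theta, y)) / (2.0 * gamma)
    return phi(x, y, b) - v_est

b, x, gamma = [2.0, -0.3, 0.1], 0.5, 1.0
y = [0.0, 1.0, -2.0]
theta_exact = prox_exact(x, y, b, gamma)
theta_loose = prox_inexact(x, y, b, gamma, tol=1e-1)
theta_tight = prox_inexact(x, y, b, gamma, tol=1e-10)
print(envelope_gap(x, y, b, gamma, theta_loose),
      envelope_gap(x, y, b, gamma, theta_tight))
```

In this sketch the gap $\varphi(x, y) - v_\gamma(x, y)$ is nonnegative and vanishes when $y$ minimizes $\varphi(x, \cdot)$, which is why it can serve as a constraint in a single-level reformulation. Loosening `tol` trades accuracy of the gap estimate for fewer inner iterations.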

Paper Structure

This paper contains 14 sections, 17 theorems, 123 equations, 2 figures, 5 tables, 2 algorithms.

Key Result

Proposition 2.7

Suppose Assumptions asup2 and asup3 hold. For $\gamma \in (0, 1/(2\rho_{f_2} + 2\rho_{g_2}))$, the function $v_\gamma(x, y)$ is $(\rho_{v_1}, \rho_{v_2})$-weakly convex on $X \times \mathbb{R}^m$ with $\rho_{v_1} \geq \rho_{f_1} + \rho_{g_1}$ and $\rho_{v_2} \geq 1/\gamma$. Additionally, $v_\gamma(x

Figures (2)

  • Figure 1: Effectiveness of the inexact criterion in AGILS: comparison with two extreme variants
  • Figure 2: Iteration and computational time of AGILS on the toy example with varying dimensions

Theorems & Definitions (39)

  • Remark 2.4
  • Example 2.5
  • Example 2.6
  • Proposition 2.7
  • Proposition 2.8
  • proof
  • Lemma 2.9
  • proof
  • Definition 2.10
  • Remark 3.1
  • ...and 29 more