Solving bilevel optimization via sequential minimax optimization

Zhaosong Lu; Sanyou Mei

TL;DR

The paper tackles constrained bilevel optimization in which the lower level is convex (possibly nonsmooth) and the upper level is possibly nonconvex. It introduces sequential minimax optimization (SMO), which solves a sequence of minimax subproblems derived from a hybrid of modified augmented Lagrangian and penalty schemes, using only first-order information and proximal computations. The main theoretical contribution is a pair of operation-complexity guarantees: SMO attains an $\varepsilon$-KKT solution in $O(\varepsilon^{-7}\log\varepsilon^{-1})$ operations when the lower-level objective is merely convex and in $O(\varepsilon^{-6}\log\varepsilon^{-1})$ operations when it is strongly convex. The strongly convex bound improves the previous best-known operation complexity by a factor of $\varepsilon^{-1}$. Preliminary experiments on constrained bilevel linear/quadratic problems and SVM hyperparameter tuning show SMO running substantially faster than a recently developed first-order penalty method while delivering competitive solution quality.

Abstract

In this paper we propose a sequential minimax optimization (SMO) method for solving a class of constrained bilevel optimization problems in which the lower-level part is a possibly nonsmooth convex optimization problem, while the upper-level part is a possibly nonconvex optimization problem. Specifically, SMO applies a first-order method to solve a sequence of minimax subproblems, which are obtained by employing a hybrid of modified augmented Lagrangian and penalty schemes on the bilevel optimization problems. Under suitable assumptions, we establish an operation complexity of $O(\varepsilon^{-7}\log\varepsilon^{-1})$ and $O(\varepsilon^{-6}\log\varepsilon^{-1})$, measured in terms of fundamental operations, for SMO in finding an $\varepsilon$-KKT solution of the bilevel optimization problems with merely convex and strongly convex lower-level objective functions, respectively. The latter result improves the previous best-known operation complexity by a factor of $\varepsilon^{-1}$. Preliminary numerical results demonstrate significantly superior computational performance compared to the recently developed first-order penalty method.
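
To make the idea of "a sequence of minimax subproblems" concrete, the following is a minimal sketch on a toy problem. It is not the SMO method of the paper: the toy problem, the function names, the penalty schedule, and the step sizes are all assumptions made for illustration. It uses a plain penalty scheme with a gradient ascent-descent inner loop, and it omits the modified augmented Lagrangian terms, the constraints, and the proximal steps.

```python
# Toy bilevel problem (hypothetical, chosen for illustration only):
#   upper level: minimize f(x, y) = (x - 1)^2 + (y + 1)^2
#   lower level: y in argmin_z (z - x)^2, which forces y = x
# Its exact solution is x = y = 0.
# Penalty reformulation with parameter rho > 0:
#   min_{x, y} max_z  f(x, y) + rho * ((y - x)^2 - (z - x)^2)


def solve_minimax_subproblem(rho, x, y, z, iters=20000):
    """Approximately solve one penalized minimax subproblem by
    alternating a gradient ascent step in z with a descent step in (x, y)."""
    step_min = 1.0 / (2.0 + 4.0 * rho)  # 1/L for the (x, y) block once z = x
    step_max = 1.0 / (2.0 * rho)        # makes the ascent step exact for this quadratic
    for _ in range(iters):
        # Ascent in z: the objective is -rho * (z - x)^2 in z.
        z = z + step_max * (-2.0 * rho * (z - x))
        # Descent in (x, y) with z held fixed.
        gx = 2.0 * (x - 1.0) - 2.0 * rho * (y - x) + 2.0 * rho * (z - x)
        gy = 2.0 * (y + 1.0) + 2.0 * rho * (y - x)
        x, y = x - step_min * gx, y - step_min * gy
    return x, y, z


def sequential_penalty_minimax(rhos=(1.0, 10.0, 100.0, 1000.0)):
    """Solve a sequence of subproblems with increasing penalty, warm-starting each one."""
    x, y, z = 2.0, -1.0, 0.0  # arbitrary starting point
    for rho in rhos:
        x, y, z = solve_minimax_subproblem(rho, x, y, z)
    return x, y, z


if __name__ == "__main__":
    x, y, z = sequential_penalty_minimax()
    print(f"x = {x:.5f}, y = {y:.5f}, lower-level gap = {(y - x) ** 2:.2e}")
```

For this toy problem the penalized subproblem has the closed-form minimizer $x = 1/(1+2\rho)$, $y = -x$, so the iterates approach the bilevel solution $(0, 0)$ only as $\rho$ grows. That is why a sequence of subproblems with increasing penalty is solved rather than a single one.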
