Single-loop Projection-free and Projected Gradient-based Algorithms for Nonconvex-concave Saddle Point Problems with Bilevel Structure

Mohammad Mahdi Ahmadi; Erfan Yazdandoost Hamedani

TL;DR

This work addresses constrained saddle-point problems with a bilevel structure, where the upper-level objective Φ is smooth and concave in the maximization variable and the lower-level objective g is strongly convex in its inner variable. It introduces two single-loop algorithms, i-BRPD:OPF (one-sided projection-free using a linear minimization oracle) and i-BRPD:FP (fully projected), built on an inexact bilevel regularized primal-dual framework that tracks the lower-level solution θ*(x) and an estimated gradient of the implicit objective. The paper proves convergence guarantees for both methods, with ε-stationary points achieved in O(ε^{-4}) iterations for OPF (improved to O(ε^{-3}) when Φ is linear in y) and O(ε^{-5}) iterations for FP (improved to O(ε^{-4}) when Φ is strongly concave in y). Numerical experiments on robust multi-task regression demonstrate that the proposed projection-free method often outperforms existing approaches like MORBiT, validating the practical efficiency and broad applicability of the framework to robust ML tasks such as multi-task learning and adversarial training.
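For orientation, the problem class described above can be written schematically as below. The notation is an assumption inferred from this summary ($\mathcal{X}$ and $\mathcal{Y}$ for the compact convex constraint sets, $\theta$ for the lower-level variable); the paper's exact formulation may differ, for example in the argument order of $\Phi$ or the domain of $\theta$.

```latex
\min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \; \Phi\bigl(x, \theta^*(x), y\bigr)
\quad \text{s.t.} \quad
\theta^*(x) \in \operatorname*{argmin}_{\theta} \; g(x, \theta)
```

Here $\Phi$ is smooth and concave in $y$, and $g(x, \cdot)$ is strongly convex, so $\theta^*(x)$ is unique for each $x$.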

Abstract

In this paper, we explore a broad class of constrained saddle point problems with a bilevel structure, wherein the upper-level objective function is nonconvex-concave and smooth over compact and convex constraint sets, subject to a strongly convex lower-level objective function. This class of problems finds wide applicability in machine learning, encompassing robust multi-task learning, adversarial learning, and robust meta-learning. Our study extends the current literature in two main directions: (i) We consider a more general setting where the upper-level function is not necessarily strongly concave or linear in the maximization variable. (ii) While existing methods for solving saddle point problems with a bilevel structure are projection-based algorithms, we propose a one-sided projection-free method employing a linear minimization oracle. Specifically, by utilizing regularization and nested approximation techniques, we introduce a novel single-loop one-sided projection-free algorithm requiring $\mathcal{O}(\epsilon^{-4})$ iterations to attain an $\epsilon$-stationary solution. Moreover, when the upper-level objective function is linear in the maximization component, our result improves to $\mathcal{O}(\epsilon^{-3})$. Subsequently, we develop an efficient single-loop fully projected gradient-based algorithm capable of achieving an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-5})$ iterations. This result improves to $\mathcal{O}(\epsilon^{-4})$ when the upper-level objective function is strongly concave in the maximization component. Finally, we test our proposed methods against state-of-the-art algorithms on a robust multi-task regression problem to showcase the superiority of our algorithms.
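To illustrate what "projection-free with a linear minimization oracle" means in general, here is a minimal sketch of one Frank-Wolfe-style update over an $\ell_1$ ball. It is not the paper's algorithm: the constraint set, the function names `lmo_l1_ball` and `frank_wolfe_step`, and the fixed step size are all assumptions chosen for illustration, and the regularization, lower-level tracking, and gradient estimation described in the abstract are omitted.

```python
def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle: argmin of <grad, s> over ||s||_1 <= radius.

    The minimizer is a vertex of the l1 ball: all mass on the coordinate
    with the largest |grad|, with sign opposite to that gradient entry.
    """
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    s = [0.0] * len(grad)
    if grad[i] > 0:
        s[i] = -radius
    elif grad[i] < 0:
        s[i] = radius
    return s  # all zeros if grad is identically zero (still feasible)


def frank_wolfe_step(x, grad, step, radius=1.0):
    """One projection-free update: a convex combination of x and the LMO output.

    If x is feasible and 0 <= step <= 1, the result stays feasible,
    so no projection onto the constraint set is needed.
    """
    s = lmo_l1_ball(grad, radius)
    return [xi + step * (si - xi) for xi, si in zip(x, s)]


# Hypothetical data, for illustration only.
x = [0.5, -0.25, 0.0]
grad = [1.0, -3.0, 2.0]
x_next = frank_wolfe_step(x, grad, step=0.5)
```

The design point is that the oracle only solves a linear problem over the constraint set, which for sets such as the $\ell_1$ ball is cheaper than a Euclidean projection.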

Paper Structure (25 sections, 18 theorems, 113 equations, 2 figures, 2 algorithms)

Introduction
Literature Review
Bilevel Optimization.
Saddle Point (SP) Problem.
Saddle Point Problem with Bilevel Structure.
Contribution
Motivating Examples
Preliminaries
Assumptions and Definitions
Problem Properties
Proposed Methods
Analysis of the Proposed Methods
Analysis of the Lower-level Approximation
Analysis of Tracking the Optimal Solution Trajectory of the Maximization Component
Analysis of Gradients Estimation
...and 10 more sections

Key Result

Lemma 2.1

Suppose Assumptions assump:grad-xytheta-lip and assump:g-conditions hold. Then for any $x, \overline{x} \in \mathcal{X}$ and $y, \overline{y} \in \mathcal{Y}$, we have that $\left\| v(x,y)-v(\overline{x},\overline{y}) \right\| \leq \mathbf{C}_{v1}\left\| x -\overline{x} \right\| + \mathbf{C}_{v2}\le

Figures (2)

Figure 1: Comparing the performance of our proposed algorithms i-BRPD:OPF (blue) and i-BRPD:FP (red) with MORBiT (green) on the robust multi-task linear regression problem
Figure 2: Comparing the performance of our proposed algorithms i-BRPD:OPF (blue) and i-BRPD:FP (red) for solving robust multi-task linear regression problems when the upper-level objective function is nonlinear

Theorems & Definitions (27)

Remark 2.1
Definition 2.1
Definition 2.2
Definition 2.3
Remark 2.2
Lemma 2.1
Lemma 2.2
Lemma 4.1
Lemma 4.2
Lemma 4.3
...and 17 more
