Single-loop Projection-free and Projected Gradient-based Algorithms for Nonconvex-concave Saddle Point Problems with Bilevel Structure

Mohammad Mahdi Ahmadi, Erfan Yazdandoost Hamedani

TL;DR

This work addresses constrained saddle-point problems with a bilevel structure, where the upper-level objective Φ is smooth and concave in the maximization variable and the lower-level objective g is strongly convex in its inner variable. It introduces two single-loop algorithms, i-BRPD:OPF (one-sided projection-free using a linear minimization oracle) and i-BRPD:FP (fully projected), built on an inexact bilevel regularized primal-dual framework that tracks the lower-level solution θ*(x) and an estimated gradient of the implicit objective. The paper proves convergence guarantees for both methods, with ε-stationary points achieved in O(ε^{-4}) iterations for OPF (improved to O(ε^{-3}) when Φ is linear in y) and O(ε^{-5}) iterations for FP (improved to O(ε^{-4}) when Φ is strongly concave in y). Numerical experiments on robust multi-task regression demonstrate that the proposed projection-free method often outperforms existing approaches like MORBiT, validating the practical efficiency and broad applicability of the framework to robust ML tasks such as multi-task learning and adversarial training.
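For reference, the problem class described above can be written roughly as follows. This is a sketch reconstructed from the summary, not a quotation of the paper: the argument order of $\Phi$ is an assumption, the constraint sets are named $\mathcal{X}$ and $\mathcal{Y}$ to match the lemma quoted under Key Result, and the domain of $\theta$ is left unspecified.

$$
\min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \; \Phi\bigl(x, y, \theta^*(x)\bigr)
\quad \text{where} \quad
\theta^*(x) = \operatorname*{argmin}_{\theta} \; g(x, \theta).
$$

Here $\Phi$ is smooth, possibly nonconvex in $x$, and concave in $y$, while $g(x, \cdot)$ is strongly convex, so $\theta^*(x)$ is uniquely defined for each $x$.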

Abstract

In this paper, we explore a broad class of constrained saddle point problems with a bilevel structure, wherein the upper-level objective function is nonconvex-concave and smooth over compact and convex constraint sets, subject to a strongly convex lower-level objective function. This class of problems finds wide applicability in machine learning, encompassing robust multi-task learning, adversarial learning, and robust meta-learning. Our study extends the current literature in two main directions: (i) We consider a more general setting where the upper-level function is not necessarily strongly concave or linear in the maximization variable. (ii) While existing methods for solving saddle point problems with a bilevel structure are projection-based algorithms, we propose a one-sided projection-free method employing a linear minimization oracle. Specifically, by utilizing regularization and nested approximation techniques, we introduce a novel single-loop one-sided projection-free algorithm requiring $\mathcal{O}(\epsilon^{-4})$ iterations to attain an $\epsilon$-stationary solution. Moreover, when the upper-level objective function is linear in the maximization component, our result improves to $\mathcal{O}(\epsilon^{-3})$. Subsequently, we develop an efficient single-loop fully projected gradient-based algorithm capable of achieving an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-5})$ iterations. This result improves to $\mathcal{O}(\epsilon^{-4})$ when the upper-level objective function is strongly concave in the maximization component. Finally, we test our proposed methods against state-of-the-art algorithms on a robust multi-task regression problem to showcase the superiority of our algorithms.
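To illustrate what "projection-free" means in the abstract, the sketch below shows a generic Frank-Wolfe style update driven by a linear minimization oracle (LMO). It is not the paper's i-BRPD:OPF algorithm: the choice of an $\ell_1$ ball as the constraint set and the names `lmo_l1_ball` and `frank_wolfe_step` are assumptions made purely for illustration.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Hypothetical LMO: argmin over {s : ||s||_1 <= radius} of <grad, s>.

    The minimizer is a signed vertex of the l1 ball, located at the
    coordinate where |grad| is largest, so no projection is needed.
    """
    grad = np.asarray(grad, dtype=float)
    s = np.zeros_like(grad)
    i = int(np.argmax(np.abs(grad)))
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe_step(x, grad, step, radius=1.0):
    """One projection-free update: move from x toward the LMO vertex.

    For step in [0, 1] the result is a convex combination of x and a
    feasible vertex, so it stays inside the l1 ball if x was inside.
    """
    s = lmo_l1_ball(grad, radius)
    return np.asarray(x, dtype=float) + step * (s - x)

# Example usage with made-up numbers.
x = np.zeros(3)
g = np.array([1.0, -3.0, 2.0])
x_next = frank_wolfe_step(x, g, step=0.5, radius=2.0)
```

A projected-gradient method would instead take a gradient step and then project back onto the set. For sets such as the $\ell_1$ ball, the LMO costs a single pass over the gradient, which is the usual motivation for projection-free updates.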

Paper Structure (25 sections, 18 theorems, 113 equations, 2 figures, 2 algorithms)

Key Result

Lemma 2.1

Suppose Assumptions assump:grad-xytheta-lip and assump:g-conditions hold. Then for any $x, \overline{x} \in \mathcal{X}$ and $y, \overline{y} \in \mathcal{Y}$, we have that $\left\| v(x,y)-v(\overline{x},\overline{y}) \right\| \leq \mathbf{C}_{v1}\left\| x -\overline{x} \right\| + \mathbf{C}_{v2}\cdots$

Figures (2)

  • Figure 1: Comparing the performance of our proposed algorithms i-BRPD:OPF (blue) and i-BRPD:FP (red) with MORBiT (green) on the robust multi-task linear regression problem
  • Figure 2: Comparing the performance of our proposed algorithms i-BRPD:OPF (blue) and i-BRPD:FP (red) for solving robust multi-task linear regression problems when the upper-level objective function is nonlinear

Theorems & Definitions (27)

  • Remark 2.1
  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Remark 2.2
  • Lemma 2.1
  • Lemma 2.2
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • ...and 17 more