Online Min-Max Optimization: From Individual Regrets to Cumulative Saddle Points

Abhijeet Vyas, Brian Bullins

TL;DR

The paper addresses online min-max optimization in settings beyond convex-concave, introducing two fresh performance notions, the static duality gap $\text{SDual-Gap}_T$ and the dynamic saddle point regret $\text{DSP-Reg}_T$, to capture progress toward a cumulative saddle point. It develops algorithms OGDA and OMMNS, with reductions to online convex optimization, and derives sublinear bounds for $\text{SDual-Gap}_T$ and $\text{DSP-Reg}_T$ under strong convexity-strong concavity and min-max exponential concavity, respectively; it also extends results to time-varying variational inequalities under a lower-regularity operator. A two-player portfolio-selection variant and a dynamic zero-sum game analysis under two-sided PL conditions illustrate practical implications and linear convergence guarantees, while a dynamic-regret framework in the sleeping-experts setting yields robust performance in non-stationary environments. Overall, the work provides a cohesive framework for tracking cumulative saddle points in online min-max problems, with convergence rates for averaged strategies and dynamic regret bounds that generalize and unify several online learning and VI results.

Abstract

We propose and study an online version of min-max optimization based on cumulative saddle points under a variety of performance measures beyond convex-concave settings. After first observing the incompatibility of (static) Nash equilibrium ($\text{SNE-Reg}_T$) with individual regrets even for strongly convex-strongly concave functions, we propose an alternate static duality gap ($\text{SDual-Gap}_T$) inspired by the online convex optimization (OCO) framework. We provide algorithms that, using a reduction to classic OCO problems, achieve bounds for $\text{SDual-Gap}_T$ and a novel dynamic saddle point regret ($\text{DSP-Reg}_T$), which we suggest naturally represents a min-max version of the dynamic regret in OCO. We derive our bounds for $\text{SDual-Gap}_T$ and $\text{DSP-Reg}_T$ under strong convexity-strong concavity and a min-max notion of exponential concavity (min-max EC), and in addition we establish a class of functions satisfying min-max EC that captures a two-player variant of the classic portfolio selection problem. Finally, for a dynamic notion of regret compatible with individual regrets, we derive bounds under a two-sided Polyak-Łojasiewicz (PL) condition.

Paper Structure

This paper contains 30 sections, 29 theorems, 164 equations, 6 algorithms.

Key Result

Theorem 3.4

Online gradient descent ascent (OGDA), when run on $\lambda$-strongly convex-strongly concave functions $f_t$ over a domain $\mathcal{X}\times \mathcal{Y} \subseteq \mathbb{R}^d$ with maximum operator norm $L_0$, generates action pairs $\{x_t,y_t\}_{t=1}^T$ such that $\text{SDual-Gap}_T$ ...
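The kind of update this theorem analyzes can be illustrated with a minimal sketch of projected online gradient descent ascent. Several pieces here are assumptions, not taken from the paper: the step size $\eta_t = 1/(\lambda t)$ (the standard schedule for online gradient descent on strongly convex losses), Euclidean balls as the domains $\mathcal{X}$ and $\mathcal{Y}$, and the hypothetical quadratic family $f_t(x,y) = \tfrac{\lambda}{2}\|x-a_t\|^2 + x^\top y - \tfrac{\lambda}{2}\|y-b_t\|^2$, which is $\lambda$-strongly convex in $x$ and $\lambda$-strongly concave in $y$. The helper names `project_ball`, `ogda`, and `make_quadratic_grad` are illustrative only.

```python
import numpy as np


def project_ball(v, radius):
    """Euclidean projection onto the ball {v : ||v|| <= radius}."""
    norm = np.linalg.norm(v)
    return v if norm <= radius else v * (radius / norm)


def ogda(grad_fns, x0, y0, lam, radius):
    """Projected online gradient descent ascent (illustrative sketch).

    grad_fns: one callable per round, mapping (x, y) to
              (grad_x f_t(x, y), grad_y f_t(x, y)).
    Returns the list of action pairs (x_t, y_t) played in each round.
    """
    x = project_ball(np.asarray(x0, dtype=float), radius)
    y = project_ball(np.asarray(y0, dtype=float), radius)
    plays = []
    for t, grad in enumerate(grad_fns, start=1):
        # Play (x_t, y_t) first; f_t is revealed only afterwards.
        plays.append((x.copy(), y.copy()))
        gx, gy = grad(x, y)
        eta = 1.0 / (lam * t)  # assumed step size, not taken from the paper
        x = project_ball(x - eta * gx, radius)  # min player descends
        y = project_ball(y + eta * gy, radius)  # max player ascends
    return plays


def make_quadratic_grad(a, b, lam):
    """Gradients of the hypothetical loss
    f(x, y) = lam/2 ||x - a||^2 + <x, y> - lam/2 ||y - b||^2."""
    def grad(x, y):
        return lam * (x - a) + y, x - lam * (y - b)
    return grad


# Usage: 200 rounds with randomly drawn targets a_t, b_t in dimension 3.
rng = np.random.default_rng(0)
lam, radius, dim, rounds = 1.0, 1.0, 3, 200
grad_fns = [
    make_quadratic_grad(0.1 * rng.standard_normal(dim),
                        0.1 * rng.standard_normal(dim), lam)
    for _ in range(rounds)
]
demo_plays = ogda(grad_fns, np.ones(dim), -np.ones(dim), lam, radius)
```

Both players use only the gradient of the most recently revealed $f_t$ at the pair they actually played, which is what makes a reduction to per-player online convex optimization natural. When every round uses the same function, the iterates of this sketch approach that function's saddle point.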

Theorems & Definitions (69)

  • Definition 2.1: Saddle point
  • Definition 3.1: Strong convexity-strong concavity
  • Definition 3.2: Exponential Concavity (EC)
  • Definition 3.3: $\text{min-max EC}$
  • Example 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Corollary 3.6
  • proof
  • Definition 3.7: Dynamic Nash Equilibrium Regret, $\text{DNE-Reg}_T$
  • ...and 59 more