Online Min-Max Optimization: From Individual Regrets to Cumulative Saddle Points

Abhijeet Vyas; Brian Bullins

TL;DR

The paper addresses online min-max optimization in settings beyond convex-concave, introducing two new performance measures, the static duality gap $\text{SDual-Gap}_T$ and the dynamic saddle point regret $\text{DSP-Reg}_T$, to capture progress toward a cumulative saddle point. It develops the algorithms OGDA and OMMNS via reductions to online convex optimization, and derives sublinear bounds for $\text{SDual-Gap}_T$ and $\text{DSP-Reg}_T$ under strong convexity-strong concavity and min-max exponential concavity, respectively; it also extends the results to time-varying variational inequalities under a lower-regularity operator. A two-player portfolio-selection variant and a dynamic zero-sum game analysis under two-sided PL conditions illustrate practical implications and linear convergence guarantees, while a dynamic-regret framework in the sleeping-experts setting yields robust performance in non-stationary environments. Overall, the work provides a cohesive framework for tracking cumulative saddle points in online min-max problems, with convergence rates for averaged strategies and dynamic regret bounds that generalize and unify several online learning and VI results.

Abstract

We propose and study an online version of min-max optimization based on cumulative saddle points under a variety of performance measures beyond convex-concave settings. After first observing the incompatibility of (static) Nash equilibrium ($\text{SNE-Reg}_T$) with individual regrets even for strongly convex-strongly concave functions, we propose an alternate static duality gap ($\text{SDual-Gap}_T$) inspired by the online convex optimization (OCO) framework. We provide algorithms that, using a reduction to classic OCO problems, achieve bounds for $\text{SDual-Gap}_T$ and a novel dynamic saddle point regret ($\text{DSP-Reg}_T$), which we suggest naturally represents a min-max version of the dynamic regret in OCO. We derive our bounds for $\text{SDual-Gap}_T$ and $\text{DSP-Reg}_T$ under strong convexity-strong concavity and a min-max notion of exponential concavity (min-max EC), and in addition we establish a class of functions satisfying min-max EC that captures a two-player variant of the classic portfolio selection problem. Finally, for a dynamic notion of regret compatible with individual regrets, we derive bounds under a two-sided Polyak-Łojasiewicz (PL) condition.
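
The "reduction to classic OCO problems" mentioned above can be illustrated with a short worked identity. The form of $\text{SDual-Gap}_T$ used here (best fixed comparator for each player against the other player's realized actions) is an assumption made for illustration, since this page does not state the definition; under that reading the gap splits exactly into the two players' individual static regrets, so any pair of OCO algorithms with sublinear regret would bound it:

$$
\begin{aligned}
\text{SDual-Gap}_T
&= \max_{y \in \mathcal{Y}} \sum_{t=1}^T f_t(x_t, y) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^T f_t(x, y_t) \\
&= \underbrace{\sum_{t=1}^T f_t(x_t, y_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^T f_t(x, y_t)}_{\text{regret of the min player}}
\;+\; \underbrace{\max_{y \in \mathcal{Y}} \sum_{t=1}^T f_t(x_t, y) - \sum_{t=1}^T f_t(x_t, y_t)}_{\text{regret of the max player}}
\end{aligned}
$$

The second line only adds and subtracts $\sum_{t=1}^T f_t(x_t, y_t)$, so it holds for any sequence of functions and actions.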

Paper Structure (30 sections, 29 theorems, 164 equations, 6 algorithms)

Introduction
Our contributions
Static performance measures in OMMO.
Dynamic performance measures in OMMO.
Time varying zero-sum games.
Preliminaries
The online min-max setting
OMMO for Functions with Lower Regularity
Two-player portfolio selection.
Static performance
Convergence to the cumulative saddle points
$\text{SDual-Gap}_T$ vs. $\text{DNE-Reg}_T$
Dynamic regret
Time-varying variational inequality objective under lower regularity
Time-Varying Zero-Sum Games
...and 15 more sections

Key Result

Theorem 3.4

Online gradient descent ascent (OGDA), when run on $\lambda$-strongly convex-strongly concave functions $f_t$ over a domain $\mathcal{X}\times \mathcal{Y} \subseteq \mathbb{R}^d$ with maximum operator norm $L_0$, generates action pairs $\{x_t,y_t\}_{t=1}^T$ such that $\text{SDual-Gap}_T$ ...
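
To make the setting of Theorem 3.4 concrete, the following is a minimal sketch of projected online gradient descent ascent on a one-dimensional toy problem. Everything specific in it is an assumption chosen for illustration rather than taken from the paper: the quadratic payoff `f`, the interval domain $[-1, 1]$ for both players, the step size $1/(\lambda t)$, the helper names, and the form of the static duality gap (best fixed comparator for each player against the other player's realized actions).

```python
import math

def clip(v, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, v))

def f(x, y, a, b, c, lam):
    """Toy payoff: lam-strongly convex in x, lam-strongly concave in y (illustrative)."""
    return 0.5 * lam * (x - a) ** 2 + b * x * y - 0.5 * lam * (y - c) ** 2

def ogda(params, lam, x0=0.5, y0=-0.5):
    """Projected online gradient descent ascent with assumed step size 1 / (lam * t).

    params is a list of (a_t, b_t, c_t) tuples, one per round.
    Returns the list of played pairs (x_t, y_t).
    """
    x, y = x0, y0
    plays = []
    for t, (a, b, c) in enumerate(params, start=1):
        plays.append((x, y))          # commit to (x_t, y_t) before f_t is revealed
        gx = lam * (x - a) + b * y    # partial derivative of f_t in x at (x_t, y_t)
        gy = b * x - lam * (y - c)    # partial derivative of f_t in y at (x_t, y_t)
        eta = 1.0 / (lam * t)
        x, y = clip(x - eta * gx), clip(y + eta * gy)  # descend in x, ascend in y
    return plays

def static_duality_gap(plays, params, lam):
    """max_y sum_t f_t(x_t, y) - min_x sum_t f_t(x, y_t), under the assumed definition."""
    T = len(plays)
    pairs = list(zip(plays, params))
    # For 1-D quadratics the best fixed comparator is the clipped unconstrained optimum.
    y_star = clip(sum(lam * c + b * x for (x, _), (_, b, c) in pairs) / (lam * T))
    x_star = clip(sum(lam * a - b * y for (_, y), (a, b, _) in pairs) / (lam * T))
    upper = sum(f(x, y_star, a, b, c, lam) for (x, _), (a, b, c) in pairs)
    lower = sum(f(x_star, y, a, b, c, lam) for (_, y), (a, b, c) in pairs)
    return upper - lower

# Slowly drifting toy payoffs over T = 1000 rounds.
lam = 1.0
params = [(0.3 * math.sin(0.01 * t), 1.0, 0.3 * math.cos(0.01 * t)) for t in range(1, 1001)]
plays = ogda(params, lam)
gap = static_duality_gap(plays, params, lam)
print(f"static duality gap after {len(plays)} rounds: {gap:.4f}")
```

In this sketch each player is simply running online gradient descent on its own strongly convex (or strongly concave) sequence, so the gap computed at the end should grow much more slowly than the number of rounds; the closed-form comparators are specific to the toy quadratic and would need a proper solver for general $f_t$.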

Theorems & Definitions (69)

Definition 2.1: Saddle point
Definition 3.1: Strong convexity-strong concavity
Definition 3.2: Exponential Concavity (EC)
Definition 3.3: $\text{min-max EC}$
Example 3.3
Theorem 3.4
Theorem 3.5
Corollary 3.6
proof
Definition 3.7: Dynamic Nash Equilibrium Regret, $\text{DNE-Reg}_T$
...and 59 more
