Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual Balancing and Iteration Complexity Analysis

Jiajin Li; Linglingzhi Zhu; Anthony Man-Cho So

TL;DR

This work tackles nonsmooth nonconvex-nonconcave minimax optimization by introducing smoothed PLDA, a proximal linear descent-ascent method for problems whose primal function has a nonsmooth composite structure and whose dual function possesses the Kurdyka-Łojasiewicz (KŁ) property with exponent $\theta\in[0,1)$. The authors build a Lyapunov-based convergence framework around new primal and dual error bounds and establish an iteration complexity of $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$ for reaching $\epsilon$-game-stationary ($\epsilon$-GS) and $\epsilon$-optimization-stationary ($\epsilon$-OS) points. The complexity exhibits a phase transition governed by the KŁ exponent: for $\theta\in[0,\tfrac{1}{2}]$ the primal updates dominate and the rate is the optimal $\mathcal{O}(\epsilon^{-2})$, while for $\theta\in(\tfrac{1}{2},1)$ the rate becomes $\mathcal{O}(\epsilon^{-4\theta})$. The paper also shows that certain max-structured problems possess the KŁ property with $\theta=0$ under mild assumptions, and it establishes algorithm-independent quantitative relationships among the MM, GS, OS, and eOS stationarity concepts.
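The phase transition can be made concrete by evaluating the exponent in the stated bound for a few KŁ exponents. The snippet below is only an illustrative sketch: the function name `complexity_exponent` is hypothetical, and it simply evaluates $2\max\{2\theta,1\}$ from the bound $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$; it does not implement smoothed PLDA.

```python
def complexity_exponent(theta: float) -> float:
    """Return p such that the stated bound reads O(eps**(-p)), p = 2 * max(2*theta, 1).

    Hypothetical helper for illustration; theta is the KL exponent in [0, 1).
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("KL exponent theta must lie in [0, 1)")
    return 2.0 * max(2.0 * theta, 1.0)


# The exponent stays at 2 up to theta = 1/2 and grows as 4*theta afterwards.
for theta in (0.0, 0.25, 0.5, 0.75, 0.9):
    print(f"theta = {theta:.2f}: O(eps^-{complexity_exponent(theta):g})")
```

For $\theta\le\tfrac{1}{2}$ the exponent is constant at 2, and for $\theta>\tfrac{1}{2}$ it equals $4\theta$, which approaches 4 as $\theta\to 1$.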

Abstract

Nonconvex-nonconcave minimax optimization has gained widespread interest over the last decade. However, most existing works focus on variants of gradient descent-ascent (GDA) algorithms, which are only applicable to smooth nonconvex-concave settings. To address this limitation, we propose a novel algorithm named smoothed proximal linear descent-ascent (smoothed PLDA), which can effectively handle a broad range of structured nonsmooth nonconvex-nonconcave minimax problems. Specifically, we consider the setting where the primal function has a nonsmooth composite structure and the dual function possesses the Kurdyka-Łojasiewicz (KŁ) property with exponent $\theta\in [0,1)$. We introduce a novel convergence analysis framework for smoothed PLDA, the key components of which are our newly developed nonsmooth primal error bound and dual error bound. Using this framework, we show that smoothed PLDA can find both $\epsilon$-game-stationary points and $\epsilon$-optimization-stationary points of the problems of interest in $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$ iterations. Furthermore, when $\theta\in [0,\frac{1}{2}]$, smoothed PLDA achieves the optimal iteration complexity of $\mathcal{O}(\epsilon^{-2})$. To further demonstrate the effectiveness and wide applicability of our analysis framework, we show that certain max-structured problems possess the KŁ property with exponent $\theta=0$ under mild assumptions. As a by-product, we establish algorithm-independent quantitative relationships among various stationarity concepts, which may be of independent interest.

Paper Structure (20 sections, 17 theorems, 157 equations, 2 figures, 2 tables, 1 algorithm)

Introduction
Main Contributions
Structure of the paper
Motivating Applications
Preliminaries
Proposed Algorithm: Smoothed PLDA
Convergence Analysis of Smoothed PLDA
Analysis Framework
Primal Error Bound
Dual Error Bound
Iteration Complexity of Smoothed PLDA
Phase Transition Phenomenon
Verification of KŁ Property
Quantitative Relationships among Different Stationarity Concepts
Closing Remarks
...and 5 more sections

Key Result

Proposition 1 (Basic descent estimate of $\Phi_r$)

Let [...], where $\zeta>0$ is the constant defined in Proposition prop:lip. Then, for any $k\ge 0$, we have [...], where $\Phi_r^k := \Phi_r(x^k,y^k,z^k)$.

Figures (2)

Figure 1: Comparison between the dual error bound (Proposition 3) and the pure dual error bound.
Figure 2: Possible relationships among the three types of stationary points MM, GS, and eOS: (a) $F(x,y)=x^3-2xy-y^2$ with $\mathcal{X}\times\mathcal{Y}=[-1,1]\times [-1,1]$; (b) $F(x,y)=\sin(x)y$ with $\mathcal{X}\times\mathcal{Y}=[-\frac{\pi}{2},\frac{\pi}{2}] \times [-1,1]$; (c) $F(x,y)=xy$ with $\mathcal{X}\times\mathcal{Y}=\mathbb{R} \times [-1,1]$.
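To get a feel for the Figure 2 examples, one can evaluate the primal value function $f(x)=\max_{y\in\mathcal{Y}}F(x,y)$ numerically. The sketch below is an illustration under assumptions: this page does not define $f$, so the standard max-structured definition is assumed, and the helper `primal_value`, the grid on $\mathcal{Y}$, and the sample points are all hypothetical choices. For examples (b) and (c) the maximum is attained at $y=\pm 1$, giving $f(x)=|\sin x|$ and $f(x)=|x|$, which are nonsmooth at $x=0$ even though $F$ itself is smooth.

```python
import math


def primal_value(F, x, y_grid):
    """Approximate f(x) = max over y in Y of F(x, y) using a finite grid on Y."""
    return max(F(x, y) for y in y_grid)


# Y = [-1, 1] sampled on a uniform grid that contains both endpoints exactly.
y_grid = [i / 100.0 for i in range(-100, 101)]
# Sample points inside X for both examples (a subset of [-pi/2, pi/2] and of R).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]

f_b = [primal_value(lambda x, y: math.sin(x) * y, x, y_grid) for x in xs]  # example (b)
f_c = [primal_value(lambda x, y: x * y, x, y_grid) for x in xs]            # example (c)

for x, vb, vc in zip(xs, f_b, f_c):
    print(f"x = {x:+.1f}: f_b(x) = {vb:.4f}, f_c(x) = {vc:.4f}")
```

A grid search is used only because it is simple to check. Since both examples are linear in $y$, the maximum over $[-1,1]$ always sits at an endpoint, so the grid values match the closed forms at the sampled points.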

Theorems & Definitions (47)

remark 1
definition 1: Stationarity measures
remark 2: Primal update
proposition 1: Basic descent estimate of $\Phi_r$
proposition 2: Lipschitz-type primal error bound
proof
remark 3
proposition 3: Dual error bound with KŁ exponent
proof
corollary 1: Alternative dual error bound with KŁ exponent
...and 37 more
