Smoothing Meets Perturbation: Unified and Tight Analysis for Nonconvex-Concave Minimax Optimization

Jiajin Li, Mahesh Nagarajan, Siyu Pan, Nanxi Zhang

TL;DR

The paper develops a unified analytical framework that disentangles and quantifies the respective roles of smoothing and perturbation, and uses it to design new first-order methods that improve the state-of-the-art iteration complexity bounds for both single-loop and double-loop schemes.

Abstract

In this paper, we investigate smooth nonconvex-concave minimax optimization problems and analyze two widely used acceleration mechanisms: perturbation and smoothing. Perturbation augments the dual objective with a small quadratic regularization term, whereas smoothing employs an auxiliary primal sequence to approximate a proximal-point update of the value function. While both techniques are known to improve convergence guarantees, their respective roles and relative strengths remain unclear. We develop a unified analytical framework that disentangles and quantifies the respective roles of smoothing and perturbation. With this framework, we design new first-order methods that improve the state-of-the-art iteration complexity bounds for both single-loop and double-loop schemes, for achieving both approximate game stationary (GS) and optimization stationary (OS) points. We also establish matching lower bounds based on carefully constructed hard instances, showing that the resulting complexity bounds are tight. Taken together, these results reveal a fundamental difference between approximate GS and OS in terms of their intrinsic complexity, and lead to the following understanding: smoothing and perturbation play fundamentally different yet complementary roles in achieving approximate GS. Their combination creates a synergistic effect that yields strictly faster convergence than either mechanism alone, whereas perturbation by itself is insufficient for OS.
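To make the two mechanisms concrete, the following is a minimal sketch of one way to combine them in a single-loop gradient descent-ascent step. It is not taken from the paper: the toy objective $f(x,y) = y\,(x^2-1)$ with $y \in [-1,1]$, the function names, the Gauss-Seidel update order, the choice of $0$ as the perturbation center, and all parameter values are assumptions made for illustration. Perturbation appears as the extra term $-\delta y$ in the dual gradient (from subtracting $\tfrac{\delta}{2}y^2$), and smoothing appears as the proximal term $r(x - z)$ together with the averaged auxiliary sequence $z$.

```python
# Illustrative sketch only: toy problem and parameters are assumptions,
# not the algorithms or step-size conditions analyzed in the paper.

def grad_x(x, y):
    # Partial derivative in x of f(x, y) = y * (x**2 - 1); nonconvex in x when y < 0.
    return 2.0 * x * y

def grad_y(x, y):
    # Partial derivative in y of f(x, y) = y * (x**2 - 1); f is linear (concave) in y.
    return x * x - 1.0

def project(y, lo=-1.0, hi=1.0):
    # Euclidean projection onto the assumed dual feasible set [lo, hi].
    return min(max(y, lo), hi)

def perturbed_smoothed_gda_step(x, y, z, c, alpha, beta, r, delta):
    # Smoothing: descend on f(x, y) + (r/2) * (x - z)**2 in x.
    x_new = x - c * (grad_x(x, y) + r * (x - z))
    # Perturbation: ascend on f(x, y) - (delta/2) * y**2 in y, then project.
    y_new = project(y + alpha * (grad_y(x_new, y) - delta * y))
    # Auxiliary primal sequence: move z a fraction beta toward the new x.
    z_new = z + beta * (x_new - z)
    return x_new, y_new, z_new

def run(x0, y0, steps, c=0.05, alpha=0.05, beta=0.1, r=1.0, delta=0.01):
    # Iterate the step from (x0, y0) with z initialized at x0.
    x, y, z = x0, y0, x0
    for _ in range(steps):
        x, y, z = perturbed_smoothed_gda_step(x, y, z, c, alpha, beta, r, delta)
    return x, y, z

if __name__ == "__main__":
    print(run(x0=2.0, y0=0.0, steps=500))
```

Setting `r = 0` and `delta = 0` reduces the step to plain projected gradient descent-ascent, setting only `r = 0` isolates perturbation, and setting only `delta = 0` isolates smoothing, which gives a simple way to compare the mechanisms on a toy instance.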

Paper Structure (39 sections, 28 theorems, 182 equations, 2 tables, 4 algorithms)

Key Result

theorem 1

Suppose that Assumptions l:smooth and ass:bound hold, and let the sequence $\{(\bm{x}_t,\bm{y}_t)\}_{t\ge 0}$ be generated by Perturbed GDA with step sizes satisfying condition con-pgda. Given any $\epsilon>0$, we have

Theorems & Definitions (70)

  • remark 1
  • definition 1: Stationarity points
  • definition 2: Initial gaps
  • remark 2
  • theorem 1: Iteration complexity of Perturbed GDA
  • theorem 2: Tightness analysis of Perturbed GDA
  • remark 3
  • theorem 3: Iteration complexity of Perturbed Smoothed GDA
  • remark 4
  • theorem 4: Tightness analysis of Perturbed Smoothed GDA
  • ...and 60 more