Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization

Tianqi Zheng; Nicolas Loizou; Pengcheng You; Enrique Mallada

TL;DR

The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. DGDA converges linearly in the bilinear and strongly convex-strongly concave settings, with faster rates than GDA, Extra-Gradient (EG), and Optimistic GDA.

Abstract

Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings. To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. We theoretically show the linear convergence of DGDA in the bilinear and strongly convex-strongly concave settings and assess its performance by comparing DGDA with other methods such as GDA, Extra-Gradient (EG), and Optimistic GDA. Our findings demonstrate that DGDA surpasses these methods, achieving superior convergence rates. We support our claims with two numerical examples that showcase DGDA's effectiveness in solving saddle point problems.
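The damping effect described in the abstract can be illustrated with a minimal sketch on the bilinear objective $f(x,y) = xy$. The update rule below is an assumption: it is plain GDA applied to an augmented function of the form $f(x,y) + \frac{\rho}{2}\|x-\hat{x}\|^2 - \frac{\rho}{2}\|y-\hat{y}\|^2$, which this summary does not spell out. The step size `eta`, the dissipation weight `rho`, the starting point, and the function names are all illustrative choices, not values from the paper.

```python
import math


def grad_f(x, y):
    # Gradients of the bilinear objective f(x, y) = x * y.
    return y, x


def gda(x, y, eta, steps):
    # Plain simultaneous GDA: descend in x, ascend in y.
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x, y = x - eta * gx, y + eta * gy
    return x, y


def dgda(x, y, eta, rho, steps):
    # Assumed DGDA form: GDA on f plus quadratic coupling to the
    # auxiliary states x_hat, y_hat (initialised at x, y).
    x_hat, y_hat = x, y
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        dx, dy = x - x_hat, y - y_hat
        x, y, x_hat, y_hat = (
            x - eta * (gx + rho * dx),   # descent step with dissipation
            y + eta * (gy - rho * dy),   # ascent step with dissipation
            x_hat + eta * rho * dx,      # auxiliary state tracks x
            y_hat + eta * rho * dy,      # auxiliary state tracks y
        )
    return x, y


# Distance to the saddle point (0, 0) after 500 iterations from (1, 1).
gda_dist = math.hypot(*gda(1.0, 1.0, eta=0.5, steps=500))
dgda_dist = math.hypot(*dgda(1.0, 1.0, eta=0.5, rho=1.0, steps=500))
print(f"GDA distance:  {gda_dist:.3e}")
print(f"DGDA distance: {dgda_dist:.3e}")
```

Under these assumed parameters, the GDA iterates spiral away from the origin while the DGDA iterates contract toward it, which matches the qualitative behavior the abstract describes.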

Paper Structure (17 sections, 5 theorems, 54 equations, 4 figures, 2 tables)

Introduction
Problem Formulation
Dissipative Gradient Descent Ascent Algorithm
Control Theory-Based Motivation
Dissipative GDA Algorithm
Key Properties and Related Algorithms
Convergence Analysis
Convergence Analysis for Bilinear Functions
Convergence Analysis for Strongly Convex-Strongly Concave Functions
Numerical Experiments
Bilinear problem
Strongly convex-strongly concave problem
Conclusion and Future Work
First Order Algorithms for Saddle Point Problems
Proof of Theorem (bilinear linear convergence)
...and 2 more sections

Key Result

Lemma 1

(Saddle Point Invariance [you2021saddle]) For the min-max optimization problem, a point $(x^*, y^*)$ is a saddle point of $f(x, y)$ if and only if $(x^*, y^*, \hat{x}^*, \hat{y}^*)$ is a saddle point of the augmented function $f(x, y, \hat{x}, \hat{y})$, with $\hat{x}^* = x^*$ and $\hat{y}^* = y^*$.
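A short first-order check suggests why the invariance is plausible. It assumes that $f$ is differentiable and that the augmented function, written $F$ here to distinguish it from $f$, adds quadratic coupling terms with weight $\rho > 0$. Neither the symbol $F$ nor this explicit form appears in this summary, so the block is a sketch under that assumption, not the paper's proof.

```latex
% Assumed form of the augmented saddle function
F(x, y, \hat{x}, \hat{y}) = f(x, y)
    + \frac{\rho}{2}\,\|x - \hat{x}\|^2
    - \frac{\rho}{2}\,\|y - \hat{y}\|^2

% Stationarity in the auxiliary variables forces them onto the originals
\nabla_{\hat{x}} F = -\rho\,(x - \hat{x}) = 0 \;\Rightarrow\; \hat{x} = x,
\qquad
\nabla_{\hat{y}} F = \rho\,(y - \hat{y}) = 0 \;\Rightarrow\; \hat{y} = y

% With \hat{x} = x and \hat{y} = y the coupling terms vanish
\nabla_{x} F = \nabla_{x} f(x, y) + \rho\,(x - \hat{x}) = \nabla_{x} f(x, y),
\qquad
\nabla_{y} F = \nabla_{y} f(x, y) - \rho\,(y - \hat{y}) = \nabla_{y} f(x, y)
```

Under this assumed form, the stationary points of $F$ are exactly the stationary points of $f$ with $\hat{x} = x$ and $\hat{y} = y$, which is consistent with the statement of the lemma.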

Figures (4)

Figure 1: Trajectories of states for GDA and DGDA for the simple bilinear objective function $f(x,y):=xy$.
Figure 2: Convergence of GDA, EG, OGDA, and DGDA in terms of the number of gradient evaluations for the bilinear problem. GDA diverges, so its error is not shown. The other three algorithms converge linearly, with DGDA providing the best performance.
Figure 3: Trajectories of GDA, EG, OGDA, and DGDA for a 2D bilinear problem. GDA diverges, while the other three algorithms converge linearly, with DGDA providing the best performance.
Figure 4: Convergence of GDA, EG, OGDA, and DGDA in terms of the number of gradient evaluations for the strongly convex-strongly concave problem. All algorithms converge linearly, and the DGDA method has the best performance.

Theorems & Definitions (10)

Definition 1: Saddle Point
Definition 2: Strongly Convex
Definition 3: $L$-Lipschitz
Lemma 1: Saddle Point Invariance
Theorem 2
Theorem 3
Corollary 4: SCSC, comparison with known rates
Remark 1
Remark 2
Theorem 5
