Min-Max Optimization Is Strictly Easier Than Variational Inequalities

Henry Shugart, Jason M. Altschuler

TL;DR

This work reveals a fundamental separation between convex-concave min-max optimization and its variational inequality (VI) reduction in the unconstrained quadratic setting. By translating convergence into extremal polynomial problems and exploiting the geometry of the spectral ranges (an interval for min-max versus a half-disc for VI), the authors prove faster optimal rates for min-max: the rate improves by a factor of roughly $3\sqrt{3}/4 \approx 1.3$ in the strongly-convex-strongly-concave case, and by about $3\sqrt{3}/2 \approx 2.6$ in the convex-concave case. The analysis hinges on Green's functions and conformal mappings to bound extremal polynomials, and demonstrates that asymmetric algorithms that target the min-max problem directly (e.g., gradient descent-ascent with slingshot stepsizes) outperform symmetric VI approaches. An extension to adaptive algorithms shows that the gap persists even when the algorithm can adapt to observed data, via a duality-based construction of hard instances. Overall, the results motivate designing dedicated min-max algorithms rather than relying on VI reductions, with potential impact on a broad class of saddle-point problems.

Abstract

Classically, a mainstream approach for solving a convex-concave min-max problem is to instead solve the variational inequality problem arising from its first-order optimality conditions. Is it possible to solve min-max problems faster by bypassing this reduction? This paper initiates this investigation. We show that the answer is yes in the textbook setting of unconstrained quadratic objectives: the optimal convergence rate for first-order algorithms is strictly better for min-max problems than for the corresponding variational inequalities. The key reason that min-max algorithms can be faster is that they can exploit the asymmetry of the min and max variables--a property that is lost in the reduction to variational inequalities. Central to our analyses are sharp characterizations of optimal convergence rates in terms of extremal polynomials which we compute using Green's functions and conformal mappings.

Paper Structure

This paper contains 22 sections, 20 theorems, 60 equations, 3 figures, 2 tables.

Key Result

Lemma 2.3

Let $\mu \geq 0$. If $f(x,y)$ is $\mu$-strongly-convex-strongly-concave, then $F=(\nabla_x f, -\nabla_y f)$ is $\mu$-strongly-monotone.
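As a sanity check on Lemma 2.3, the sketch below numerically tests the strong-monotonicity inequality $\langle F(z) - F(w), z - w \rangle \geq \mu \|z - w\|^2$ on a small quadratic instance. The quadratic form $f(x,y) = \tfrac{1}{2}x^\top A x + x^\top B y - \tfrac{1}{2}y^\top C y$, the specific matrices `A`, `B`, `C`, and the helper names `F`, `monotonicity_gap`, and `min_gap` are illustrative assumptions, not taken from the paper. Here $A \succeq \mu I$ and $C \succeq \mu I$, so $f$ is $\mu$-strongly-convex-strongly-concave.

```python
import random

MU = 0.1
# Hypothetical instance: A and C are symmetric with eigenvalues >= MU.
A = [[0.1, 0.0], [0.0, 1.0]]
C = [[0.5, 0.0], [0.0, 0.1]]
B = [[1.0, -2.0], [3.0, 0.5]]  # the coupling term may be arbitrary


def F(z):
    """F = (grad_x f, -grad_y f) for f(x,y) = x'Ax/2 + x'By - y'Cy/2."""
    x, y = z[:2], z[2:]
    # grad_x f = A x + B y
    gx = [sum(A[i][j] * x[j] for j in range(2))
          + sum(B[i][k] * y[k] for k in range(2)) for i in range(2)]
    # grad_y f = B'x - C y, hence -grad_y f = C y - B'x
    gy = [sum(C[k][l] * y[l] for l in range(2))
          - sum(B[i][k] * x[i] for i in range(2)) for k in range(2)]
    return gx + gy


def monotonicity_gap(z, w, mu=MU):
    """<F(z) - F(w), z - w> - mu * ||z - w||^2; nonnegative iff the bound holds."""
    d = [a - b for a, b in zip(z, w)]
    Fd = [a - b for a, b in zip(F(z), F(w))]
    return sum(p * q for p, q in zip(Fd, d)) - mu * sum(q * q for q in d)


def min_gap(trials=1000, seed=0):
    """Smallest gap observed over random pairs of points in [-1, 1]^4."""
    rng = random.Random(seed)
    sample = lambda: [rng.uniform(-1.0, 1.0) for _ in range(4)]
    return min(monotonicity_gap(sample(), sample()) for _ in range(trials))


if __name__ == "__main__":
    print("smallest gap over random pairs:", min_gap())
```

The coupling matrix `B` drops out of the inner product because it enters $F$ through a skew-symmetric block, so the gap reduces to $x^\top (A - \mu I) x + y^\top (C - \mu I) y \geq 0$. The bound is tight along the eigendirection of `A` with eigenvalue `MU`.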

Figures (3)

  • Figure 1: The conformal mapping $\Phi_{\Omega}$ in \ref{lem:conformal} from the exterior $\hat{\mathbb{C}} \setminus \Omega$ of the half disc to the exterior $\hat{\mathbb{C}} \setminus D$ of the disc. In this plot, a point $\lambda \in \hat{\mathbb{C}} \setminus \Omega$ (left) is mapped to the point $\Phi_{\Omega}(\lambda) \in \hat{\mathbb{C}} \setminus D$ (right) of the same color.
  • Figure 2: Contour plot of Green's function $g_{\Omega}$ for the unit half disc $\Omega$.
  • Figure 3: Slingshot stepsize schedule (\ref{def:steps-SCSC}) for $\mu=0.1$, $L=1$, and $T=16$.

Theorems & Definitions (42)

  • Definition 2.1: (Strongly) convex-concave functions
  • Definition 2.2: (Strongly) monotone operators
  • Lemma 2.3: Convexity-concavity implies monotonicity (Rockafellar, 1970)
  • proof
  • Definition 2.4: Spectral range
  • Lemma 2.5: Spectral range of $\mathcal{J}_{\mu}$
  • Lemma 2.6: Spectral range of $\mathcal{H}_{\mu}$
  • proof
  • Definition 2.7: Symmetric algorithms
  • Definition 2.8: Asymmetric algorithms
  • ...and 32 more