A Primal-Dual Approach to Solving Variational Inequalities with General Constraints

Tatjana Chavdarova, Tong Yang, Matteo Pagliardini, Michael I. Jordan

TL;DR

The paper develops and analyzes a primal-dual approach (ACVI) for solving variational inequalities (VIs) under general constraints, removing the need for exact subproblem solutions via a warm-started inexact variant (I-ACVI). It establishes a nonasymptotic last-iterate convergence rate of $O(1/\sqrt{K})$ for monotone VIs without assuming $L$-Lipschitzness, and shows that I-ACVI preserves this rate under suitable error decay. A projection-based specialization (P-ACVI) is introduced for simple inequality constraints, preserving the same rate, and its inexact variant (PI-ACVI) is analyzed for efficiency. Experiments on 2D and high-dimensional games and on a constrained GAN on MNIST show faster wall-clock convergence with warm-starting and practical gains over projection-based baselines. Overall, these results advance last-iterate guarantees for VI solvers with general constraints and yield practical, scalable algorithms for large-scale problems.

Abstract

Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities (VIs) under a limiting assumption that analytic solutions of specific subproblems are available. In this paper, we circumvent this assumption via a warm-starting technique where we solve subproblems approximately and initialize variables with the approximate solution found at the previous iteration. We prove the convergence of this method and show that the gap function of the last iterate of the method decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone. In numerical experiments, we show that this technique can converge much faster than its exact counterpart. Furthermore, for the cases when the inequality constraints are simple, we introduce an alternative variant of ACVI and establish its convergence under the same conditions. Finally, we relax the smoothness assumptions in Yang et al., yielding, to our knowledge, the first convergence result for VIs with general constraints that does not rely on the assumption that the operator is $L$-Lipschitz.
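
The warm-starting idea described in the abstract can be illustrated on a toy problem. The sketch below is not I-ACVI: it assumes a hypothetical sequence of quadratic subproblems whose minimizers drift slowly, solves each one with only a few gradient steps, and compares initializing from the previous approximate solution (warm start) with restarting from zero (cold start). All function names, the drift schedule, and the step size are assumptions made for illustration.

```python
import numpy as np


def inexact_solve(center, x_init, num_steps, step_size=0.5):
    """Run a few gradient steps on f(x) = 0.5 * ||x - center||^2."""
    x = x_init.copy()
    for _ in range(num_steps):
        # The gradient of f at x is (x - center).
        x = x - step_size * (x - center)
    return x


def run_outer_loop(num_outer=20, num_inner=2, warm_start=True):
    """Solve a drifting sequence of toy subproblems approximately.

    Returns the distance to the exact subproblem minimizer after each
    outer iteration.
    """
    dim = 2
    x = np.zeros(dim)
    errors = []
    for t in range(num_outer):
        # Hypothetical subproblem: its minimizer drifts toward (1, 1).
        center = np.ones(dim) * (1.0 - 0.5 ** (t + 1))
        # Warm start reuses the previous approximate solution;
        # cold start always restarts from the origin.
        x_init = x if warm_start else np.zeros(dim)
        x = inexact_solve(center, x_init, num_inner)
        errors.append(float(np.linalg.norm(x - center)))
    return errors


warm_errors = run_outer_loop(warm_start=True)
cold_errors = run_outer_loop(warm_start=False)
print("final error, warm start:", warm_errors[-1])
print("final error, cold start:", cold_errors[-1])
```

With the same inner budget, the warm-started error shrinks across outer iterations because each solve starts close to the new minimizer, whereas the cold-started error stays roughly constant.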

Paper Structure

This paper contains 61 sections, 40 theorems, 280 equations, 16 figures, 2 tables, 4 algorithms.

Key Result

Theorem 3.1

Given a continuous operator $F\colon \mathcal{X}\to {\mathbb R}^n$, assume: (i) $F$ is monotone on $\mathcal{C}_=$, as per Def. 2.1; (ii) either $F$ is strictly monotone on $\mathcal{C}$ or one of $\varphi_i$ is strictly convex. Let $({\bm{x}}_K^{(t)}, {\bm{y}}_K^{(t)}, {\bm{\lambda}}_K^{(t)})$ ...

Figures (16)

  • Figure 1: Convergence of ACVI and I-ACVI on the \ref{eq:2d-bg} problem. The central path is depicted in yellow. For all methods, we show the ${\bm{y}}$-iterates initialized at the same point (blue circle). Each subsequent point on the trajectory depicts the (exact or approximate) solution at the end of the inner loop. A yellow star represents the game's Nash equilibrium (NE), and the constraint set is the interior of the red square. (a): As we decay $\mu_t$, the solutions of the inner loop of ACVI follow the central path. As $\mu_t \rightarrow 0$, the solution of the inner loop of ACVI converges to the NE. (b, c, d): When the ${\bm{x}}$ and ${\bm{y}}$ subproblems are solved approximately with a finite $K$ and $\ell$, the iterates need not converge as the approximation error increases (and $K$ decreases). See § \ref{sec:experiments} for a discussion.
  • Figure 2: Intermediate iterates of PI-ACVI (Algorithm \ref{alg:log_free_acvi}) on the 2D minmax game \ref{eq:2d-bg}. The boundary of the constraint set is shown in red. (b) depicts the ${\bm{y}}_k$ (from line $7$ in Algorithm \ref{alg:log_free_acvi}) which we obtain through projections. In (a), each spiral corresponds to iteratively solving the ${\bm{x}}_k$ subproblem for $\ell=20$ steps (line $6$ in Algorithm \ref{alg:log_free_acvi}). Jointly, the trajectories of ${\bm{x}}$ and ${\bm{y}}$ illustrate the ACVI dynamics: ${\bm{x}}$ and the constrained ${\bm{y}}$ "collaborate" and converge to the same point.
  • Figure 3: Experiments on the \ref{eq:c-gan} game, using GDA, EG, and PI-ACVI on MNIST. All curves are averaged over $4$ seeds. (a): Fréchet Inception Distance (FID, lower is better) given CPU wall-clock time. (b): Inception Score (IS, higher is better) given wall-clock time. We observe that PI-ACVI converges faster than EG and GDA for both metrics. Moreover, we see that using a large $\ell$ for the first iteration ($\ell_0$) can give a significant advantage. The two PI-ACVI curves use the same $\ell_+=20$.
  • Figure 4: Comparison between I-ACVI, (exact) ACVI, and projection-based algorithms on the \ref{eq:high_dim_bg} problem. (a): CPU time (in seconds) to reach a given relative error ($x$-axis), where the rotational intensity is fixed to $\eta=0.05$ in \ref{eq:high_dim_bg} for all methods. (b): Number of iterations to reach a relative error of $0.02$ for varying values of the rotational intensity $\eta$. We fix the maximum number of iterations to $50$. (c): Joint impact of the number of inner-loop iterations $K_0$ at $t=0$ and different choices of inner-loop iterations for $K_+$ at any $t>0$ on the number of iterations needed to reach a fixed relative error of $10^{-4}$. We see that irrespective of the selection of $K_+$, I-ACVI converges fast if $K_0$ is large enough. For instance, $(K_0=130, K_+=1)$ converges faster than $(K_0=20, K_+=20)$. We fix $\ell=10$ for all the experiments, in all of (a), (b), and (c).
  • Figure 5: Complementary illustrations to those in Fig. \ref{fig:acvi-c-bilin-y} of the main part: depicting here the trajectories of the ${\bm{x}}$ iterates. We compare the convergence of ACVI and I-ACVI with different parameters on the \ref{eq:2d-bg} problem while also depicting the central path (shown in yellow). Each subsequent bullet on the trajectory depicts the (exact or approximate) solution at the end of the inner loop (when $k \equiv K-1$). The Nash equilibrium (NE) of the game is represented by a yellow star, and the constraint set is the interior of the red square.
  • ...and 11 more figures
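
The captions above use gradient descent ascent (GDA) and extragradient (EG) as baselines and refer to rotational dynamics. As background, the following minimal sketch runs both methods on an unconstrained toy bilinear game $\min_x \max_y xy$, where the iterates of simultaneous GDA spiral outward while those of EG spiral inward. The toy game, the step size, and all function names are assumptions made for illustration; this is not the paper's constrained problem or its implementation.

```python
import numpy as np


def bilinear_operator(z):
    """Operator F(x, y) = (df/dx, -df/dy) for the toy game f(x, y) = x * y."""
    x, y = z
    return np.array([y, -x])


def gda(z0, step_size, num_steps):
    """Simultaneous gradient descent ascent: z <- z - step_size * F(z)."""
    z = np.array(z0, dtype=float)
    for _ in range(num_steps):
        z = z - step_size * bilinear_operator(z)
    return z


def extragradient(z0, step_size, num_steps):
    """Extragradient: evaluate F at a look-ahead point, then step from z."""
    z = np.array(z0, dtype=float)
    for _ in range(num_steps):
        z_half = z - step_size * bilinear_operator(z)
        z = z - step_size * bilinear_operator(z_half)
    return z


z0 = (1.0, 1.0)  # the unique equilibrium of this toy game is (0, 0)
gda_final = gda(z0, step_size=0.1, num_steps=100)
eg_final = extragradient(z0, step_size=0.1, num_steps=100)
print("distance to equilibrium, GDA:", np.linalg.norm(gda_final))
print("distance to equilibrium, EG: ", np.linalg.norm(eg_final))
```

For this toy game, each GDA step multiplies the squared distance to the equilibrium by $1+\gamma^2$ and each EG step multiplies it by $(1-\gamma^2)^2+\gamma^2$, where $\gamma$ denotes the step size, so for $0<\gamma<1$ the former grows and the latter shrinks.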

Theorems & Definitions (75)

  • Definition 2.1: monotone operators
  • Definition 2.2: gap function
  • Definition 2.3: $\sigma$-approximate solution
  • Definition 2.4: $\varepsilon$-minimizer
  • Theorem 3.1: Last iterate convergence rate of ACVI (Algorithm $1$ in Yang et al., 2023)
  • Theorem 3.2: Last iterate convergence rate of Inexact ACVI (Algorithm \ref{alg:inexact_acvi} with $\wp_1$ or $\wp_2$)
  • Theorem 4.1: Last iterate convergence rate of P-ACVI (Algorithm \ref{alg:log_free_acvi})
  • Remark 4.2
  • Definition A.1: $L$-Lipschitz operator
  • Definition A.2: $\frac{1}{\mu}$-cocoercive operator
  • ...and 65 more
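
For orientation, the notions named in Definitions 2.1, 2.2, and A.1 are usually stated as follows. This is a sketch in standard notation, assuming a constraint set $\mathcal{C}$ and an operator $F$; the paper's own definitions may differ in notation or in the set over which each property is required to hold.

```latex
% Variational inequality (VI) problem:
\text{find } {\bm{x}}^\star \in \mathcal{C} \text{ such that }
\langle {\bm{x}} - {\bm{x}}^\star, F({\bm{x}}^\star) \rangle \ge 0
\quad \forall {\bm{x}} \in \mathcal{C}.

% Monotone operator:
\langle {\bm{x}} - {\bm{x}}', F({\bm{x}}) - F({\bm{x}}') \rangle \ge 0
\quad \forall {\bm{x}}, {\bm{x}}'.

% Gap function:
\mathcal{G}({\bm{x}}, \mathcal{C}) =
\max_{{\bm{x}}' \in \mathcal{C}} \langle F({\bm{x}}), {\bm{x}} - {\bm{x}}' \rangle.

% L-Lipschitz operator:
\lVert F({\bm{x}}) - F({\bm{x}}') \rVert \le L \lVert {\bm{x}} - {\bm{x}}' \rVert
\quad \forall {\bm{x}}, {\bm{x}}'.
```

Under this form, the gap function is nonnegative for every ${\bm{x}} \in \mathcal{C}$ (take ${\bm{x}}' = {\bm{x}}$) and vanishes exactly when ${\bm{x}}$ solves the VI, which is why it serves as the convergence measure in the rates quoted above.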