A Low-Rank Symplectic Gradient Adjustment Method for Computing Nash Equilibria

Nadja Vater, Katherine Rossella Foglia, Vittorio Colao, Alfio Borzì

TL;DR

This work addresses the efficient computation of Nash equilibria in two-player games by leveraging second-order information through symplectic gradient adjustment (SGA) and its low-rank variant (LRSGA). It provides a rigorous convergence framework based on monotone operator theory, establishing contraction properties and linear convergence for SGA in suitable regimes and proving local convergence for LRSGA under accurate cross-derivative approximations. The LRSGA method achieves large computational gains via rank-one updates to the mixed-derivative blocks, reducing per-iteration cost while maintaining precision, as demonstrated in CLIP-model training, where CPU time and environmental impact are dramatically lowered. The combination of theoretical guarantees and practical validation indicates that LRSGA is a scalable and more environmentally friendly alternative for solving Nash equilibria in large-scale, neural-network–based game settings.
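To make the idea of a symplectic gradient adjustment concrete, here is a minimal sketch on a toy bilinear game, $f(x,y) = x^\top M y$ and $g(x,y) = -f(x,y)$. It assumes the update $w \leftarrow w - \eta(\xi + \lambda A^\top \xi)$ commonly used for SGA in the differentiable-games literature, where $\xi = (\nabla_x f, \nabla_y g)$ and $A$ is the antisymmetric part of the Jacobian of $\xi$; the exact scheme and parameter choices analyzed in the paper may differ. The game, the function names (`xi`, `game_jacobian`, `gd_step`, `sga_step`, `run`) and the values of `eta` and `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def xi(w, M):
    """Simultaneous gradient (grad_x f, grad_y g) for f = x^T M y, g = -f."""
    n = M.shape[0]
    x, y = w[:n], w[n:]
    return np.concatenate([M @ y, -M.T @ x])

def game_jacobian(M):
    """Jacobian of xi; for this bilinear game it is constant and antisymmetric."""
    n, m = M.shape
    return np.block([[np.zeros((n, n)), M],
                     [-M.T, np.zeros((m, m))]])

def gd_step(w, M, eta):
    """Plain simultaneous gradient step."""
    return w - eta * xi(w, M)

def sga_step(w, M, eta, lam):
    """Gradient step with the assumed adjustment term lam * A^T xi."""
    J = game_jacobian(M)
    A = 0.5 * (J - J.T)          # antisymmetric part of the Jacobian
    g = xi(w, M)
    return w - eta * (g + lam * (A.T @ g))

def run(step, w0, iters, **kwargs):
    """Apply `step` repeatedly, starting from w0."""
    w = w0.copy()
    for _ in range(iters):
        w = step(w, **kwargs)
    return w

M = np.array([[1.0]])            # scalar game f(x, y) = x * y
w0 = np.array([1.0, 1.0])        # the Nash equilibrium is (0, 0)

w_gd = run(gd_step, w0, 200, M=M, eta=0.1)
w_sga = run(sga_step, w0, 200, M=M, eta=0.1, lam=1.0)

print("GD  distance to equilibrium:", np.linalg.norm(w_gd))
print("SGA distance to equilibrium:", np.linalg.norm(w_sga))
```

On this toy problem the plain gradient iterates spiral away from the equilibrium, while the adjusted iterates contract toward it. The price is that each step needs the mixed-derivative blocks of the Jacobian.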

Abstract

This work presents a theoretical and numerical investigation of the symplectic gradient adjustment (SGA) method and of a low-rank SGA (LRSGA) method for efficiently solving two-objective optimization problems in the framework of Nash games. The SGA method outperforms the gradient method by including second-order mixed derivatives computed at each iterate, which requires considerably larger computational effort. For this reason, an LRSGA method is proposed in which the approximations to the second-order mixed derivatives are obtained by rank-one updates. The theoretical analysis presented in this work focuses on novel convergence estimates for the SGA and LRSGA methods, including parameter bounds. The lower computational cost of the LRSGA method is demonstrated in the training of a CLIP neural architecture, where the LRSGA method requires orders of magnitude less CPU time than the SGA method.
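The abstract does not state the rank-one update formula, so the following is only a sketch of the general idea, using a Broyden-type secant update as a stand-in: an approximation `B` of the mixed-derivative block $\partial^2 f / \partial x \partial y$ is corrected so that it reproduces the most recently observed change in $\nabla_x f$. The function name `rank_one_update`, the test matrix `M`, and the choice of the Broyden formula are assumptions made for illustration; the paper's LRSGA update may be defined differently.

```python
import numpy as np

def rank_one_update(B, dy, dg):
    """Broyden-type rank-one correction (an assumed stand-in, not the paper's formula).

    B  : current approximation of the mixed-derivative block d^2 f / dx dy
    dy : change in the second player's variable y
    dg : observed change in grad_x f caused by dy (with x held fixed)

    The returned matrix satisfies the secant condition B_new @ dy == dg.
    """
    denom = dy @ dy
    if denom == 0.0:
        return B                  # no movement in y, nothing to learn
    return B + np.outer(dg - B @ dy, dy) / denom

# Toy check on f(x, y) = x^T M y, whose exact mixed-derivative block is M.
M = np.array([[2.0, -1.0],
              [0.5, 3.0]])

def grad_x_f(y):
    """Gradient of f with respect to x; it depends only on y for this f."""
    return M @ y

B = np.zeros_like(M)              # start with no curvature information
y = np.zeros(2)
for dy in np.eye(2):              # two orthogonal steps in y
    y_new = y + dy
    dg = grad_x_f(y_new) - grad_x_f(y)
    B = rank_one_update(B, dy, dg)
    y = y_new

print(B)                          # compare with M
```

Each correction costs one matrix-vector product and one outer product, rather than a fresh evaluation of the full mixed-derivative block, which is the kind of per-iteration saving that rank-one updates are meant to provide.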

Paper Structure

This paper contains 14 sections, 10 theorems, 59 equations, 9 figures, 2 tables.

Key Result

Theorem 2.2

Assume that $X$ and $Y$ are compact and convex subsets. Let $f$ and $g$ be continuous, and assume that the map $x \mapsto f(x,y)$ is a convex function of $x$, for each fixed $y \in Y$; further assume that the map $y \mapsto g(x,y)$ is a convex function of $y$, for each fixed $x \in X$. Then there ex

Figures (9)

  • Figure 1: Iterates of the GD method with $\eta=0.7$ (left) and $\eta=1$ (right).
  • Figure 2: Architecture of the CLIP network used in the experiments.
  • Figure 3: Illustration of the similarity matrix between image and text embeddings for a batch of size $N$.
  • Figure 4: Training loss curves (LossI and LossT) for LRSGA and SGA across different values of $\eta$.
  • Figure 5: Box plots of image-to-text and text-to-image loss computed on the test set for each configuration.
  • ...and 4 more figures

Theorems & Definitions (23)

  • Definition 2.1
  • Theorem 2.2
  • Definition 2.3
  • Lemma 2.4
  • proof
  • Example 3.1
  • Definition 4.1
  • Proposition 4.2
  • Lemma 4.3
  • proof
  • ...and 13 more