Large deviations for interacting particle dynamics for finding mixed equilibria in zero-sum games

Viktor Nilsson; Pierre Nyquist

TL;DR

This work analyzes mixed Nash equilibria in continuous zero-sum games through entropy-regularised, two-layer interacting particle dynamics. The authors establish a large deviation principle (LDP) for the joint empirical measures of the particle system, framed within an augmented-space (relaxed-control) formulation, and show that the limit dynamics correspond to entropy-regularised Wasserstein gradient flows for the two player populations. They prove almost-sure convergence of the particle marginals to the mean-field PDE solution and show that the Nikaidô-Isoda (NI) error converges accordingly; additionally, they derive an LDP for the NI error via the contraction principle. The results provide a rigorous link between finite-particle stochastic dynamics and infinite-particle mean-field equilibria, offering a principled basis for analyzing convergence and informing parameter choices in related training dynamics for GANs and reinforcement learning.

Abstract

Finding equilibrium points in continuous minmax games has become a key problem within machine learning, in part due to its connection to the training of generative adversarial networks and reinforcement learning. Because of existence and robustness issues, recent developments have shifted from pure equilibria to focusing on mixed equilibrium points. In this work we consider a method for finding mixed equilibria in two-layer zero-sum games based on entropic regularisation, where the two competing strategies are represented by two sets of interacting particles. We show that the sequence of empirical measures of the particle system satisfies a large deviation principle as the number of particles grows to infinity, and how this implies convergence of the empirical measure and the associated Nikaidô-Isoda error, complementing existing law of large numbers results.
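To make the idea of two interacting particle populations concrete, here is a minimal sketch of a noisy gradient descent-ascent scheme. It is not taken from the paper: the toy payoff $K(x, y) = \cos(x - y)$ on the circle, the Euler-Maruyama discretisation, and the names `simulate`, `tau` (noise temperature standing in for the entropic regularisation strength) and `dt` are all illustrative assumptions.

```python
import numpy as np

def grad_x(x, y):
    # d/dx of the assumed toy payoff K(x, y) = cos(x - y)
    return -np.sin(x - y)

def grad_y(x, y):
    # d/dy of the assumed toy payoff K(x, y) = cos(x - y)
    return np.sin(x - y)

def simulate(n=200, steps=500, dt=0.01, tau=0.1, seed=0):
    """Euler-Maruyama sketch of two interacting particle populations.

    x-particles descend the payoff averaged over the y-particles,
    y-particles ascend the payoff averaged over the x-particles,
    and both receive independent Gaussian noise of variance 2 * tau * dt.
    Positions live on the circle [0, 2*pi) (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0 * np.pi, n)
    y = rng.uniform(0.0, 2.0 * np.pi, n)
    noise_scale = np.sqrt(2.0 * tau * dt)
    for _ in range(steps):
        # Pairwise arrays: entry [i, j] is evaluated at (x[i], y[j]).
        gx = grad_x(x[:, None], y[None, :]).mean(axis=1)  # average over y-particles
        gy = grad_y(x[:, None], y[None, :]).mean(axis=0)  # average over x-particles
        x_new = x - dt * gx + noise_scale * rng.standard_normal(n)
        y_new = y + dt * gy + noise_scale * rng.standard_normal(n)
        # Wrap back onto the circle.
        x, y = np.mod(x_new, 2.0 * np.pi), np.mod(y_new, 2.0 * np.pi)
    return x, y

if __name__ == "__main__":
    xs, ys = simulate()
    # Empirical payoff under the two final particle clouds.
    print(np.cos(xs[:, None] - ys[None, :]).mean())
```

The empirical measures of the two returned arrays play the role of the players' mixed strategies; increasing `n` corresponds to the many-particle regime that the large deviation results describe.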

Paper Structure (3 sections, 4 theorems, 37 equations)

This paper contains 3 sections, 4 theorems, 37 equations.

Introduction
Definitions and model setup
An LDP and a.s.-convergence of the NI error

Key Result

Theorem 3.2

For each $n \in \mathbb N$, define $\gamma ^n = \frac{1}{n} \sum _{i=1} ^n \delta _{(X^{i,n}, Y^{i,n})} \in \mathcal{P} (C([0,T]:\mathcal{Z}))$. Assume that $\gamma ^n _0 \to \gamma _0$, for some $\gamma _0 \in \mathcal{P} (\mathcal{Z})$, as $n \to \infty$. Then the family of empirical measures $\{\gamma ^n\}_{n \in \mathbb N}$ satisfies a large deviation principle with a rate function [...], where the supremum is taken over real Schwartz functions $f$ on $\mathcal{Z}$ and, for an element [...]
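
For orientation, the generic meaning of a large deviation principle can be written out as follows. This is the standard textbook definition, not the theorem's specific statement: the symbol $I$ for the rate function and the speed $n$ are assumptions made here, and the paper's explicit rate function is not reproduced. A sequence $\{\gamma ^n\}$ of random elements of a Polish space $\mathcal{S}$ satisfies an LDP with speed $n$ and rate function $I : \mathcal{S} \to [0, \infty]$ (with compact sub-level sets) if, for every closed set $F \subset \mathcal{S}$ and every open set $G \subset \mathcal{S}$,

$$\limsup _{n \to \infty} \frac{1}{n} \log \mathbb{P} (\gamma ^n \in F) \leq - \inf _{\gamma \in F} I(\gamma), \qquad \liminf _{n \to \infty} \frac{1}{n} \log \mathbb{P} (\gamma ^n \in G) \geq - \inf _{\gamma \in G} I(\gamma).$$

Informally, $\mathbb{P} (\gamma ^n \approx \gamma) \approx e^{-n I(\gamma)}$, so trajectories with $I(\gamma) > 0$ become exponentially unlikely as the number of particles grows.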

Theorems & Definitions (7)

Theorem 3.2
proof: Proof of Theorem 3.2
Corollary 3.3
proof
Proposition 3.4
proof
Corollary 3.5
