Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems

Juno Kim, Kakei Yamamoto, Kazusato Oko, Zhuoran Yang, Taiji Suzuki

TL;DR

This paper proposes mean-field Langevin averaged gradient (MFL-AG), a single-loop algorithm that implements gradient descent ascent in the distribution spaces with a novel weighted averaging, and establishes average-iterate convergence to the mixed Nash equilibrium.

Abstract

In this paper, we extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates. We propose mean-field Langevin averaged gradient (MFL-AG), a single-loop algorithm that implements gradient descent ascent in the distribution spaces with a novel weighted averaging, and establish average-iterate convergence to the mixed Nash equilibrium. We also study both time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result which accounts for the dependency of the particle interactions on all previous distributions. Furthermore, we propose mean-field Langevin anchored best response (MFL-ABR), a symmetric double-loop algorithm based on best response dynamics with linear last-iterate convergence. Finally, we study applications to zero-sum Markov games and conduct simulations demonstrating long-term optimality.
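To make the idea of noisy gradient descent ascent with weighted gradient averaging concrete, here is a minimal particle sketch. It is an illustration under stated assumptions, not the paper's MFL-AG algorithm: the bilinear objective $L(\mu,\nu) = \mathbb{E}_{x\sim\mu,\,y\sim\nu}[xy]$, the standard Gaussian reference measure, the polynomial weights $k^r$, the function name `averaged_gda_particles`, and all step sizes are hypothetical choices made for this example.

```python
import numpy as np


def averaged_gda_particles(n=256, steps=2000, eta=0.05, lam=1.0, r=1.0, seed=0):
    """Toy noisy gradient descent ascent on particles with averaged gradients.

    Assumptions (not taken from the paper):
      * objective L(mu, nu) = E[x * y] with scalar x ~ mu, y ~ nu;
      * each player is regularized by lam * KL(. || N(0, 1));
      * past gradients are averaged with weights beta_k = k ** r.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(1.0, 1.0, size=n)   # particles of the minimizing player
    y = rng.normal(-1.0, 1.0, size=n)  # particles of the maximizing player

    gx_avg = 0.0      # weighted average of past E_nu[y], the x-gradient of L
    gy_avg = 0.0      # weighted average of past E_mu[x], the y-gradient of L
    weight_sum = 0.0  # running sum of the weights beta_1 + ... + beta_k
    noise_scale = np.sqrt(2.0 * eta * lam)

    for k in range(1, steps + 1):
        beta = k ** r
        weight_sum += beta
        # Incremental form of the weighted average sum_j beta_j g_j / sum_j beta_j.
        gx_avg += (beta / weight_sum) * (y.mean() - gx_avg)
        gy_avg += (beta / weight_sum) * (x.mean() - gy_avg)

        # Symmetric Langevin steps: descent for x, ascent for y, both pulled
        # toward the Gaussian reference measure by the lam * (particle) term.
        x_new = x - eta * (gx_avg + lam * x) + noise_scale * rng.normal(size=n)
        y_new = y + eta * (gy_avg - lam * y) + noise_scale * rng.normal(size=n)
        x, y = x_new, y_new

    return x, y


if __name__ == "__main__":
    x, y = averaged_gda_particles()
    print("mean of x particles:", x.mean())
    print("mean of y particles:", y.mean())
```

In this toy game, symmetry suggests that both equilibrium distributions are centered at zero, so the particle means should drift toward zero even though the two players start at opposite offsets. The sketch returns only the final particles; the paper's guarantee for MFL-AG concerns the averaged iterate, which is not tracked here.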

Paper Structure

This paper contains 33 sections, 35 theorems, 194 equations, 1 figure, and 2 algorithms.

Key Result

Proposition 2.1

Under Assumptions ass-rhoreg and ass-Lreg, the solution $(\mu^*,\nu^*)$ to eq-problemstatement uniquely exists and satisfies the first-order equations

Figures (1)

  • Figure 1: Density evolution of (a) MFL-AG, (b) MFL-ABR, and (c) MFL-DA every 100 epochs. (d) Convergence speed measured in $W_1$ distance. (e) Optimality comparison via 3-point NI error.

Theorems & Definitions (61)

  • Proposition 2.1: Existence and uniqueness of MNE
  • Proposition 3.1: Well-definedness of MFL-AG flow
  • Proposition 3.2
  • Proposition 3.3: Proximal convergence of MFL-AG flow
  • Theorem 3.4: Average-iterate convergence of MFL-AG flow
  • Lemma 3.5: Entropy sandwich lower bound
  • Proposition 3.6
  • Theorem 3.7: Convergence of discretized MFL-AG
  • Theorem 4.1: Convergence of MFL-ABR
  • Proposition 5.1
  • ...and 51 more