Mean-field underdamped Langevin dynamics and its spacetime discretization

Qiang Fu, Ashia Wilson

TL;DR

This work proposes a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures, based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics.

Abstract

We propose a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures. Examples of problems with this formulation include training mean-field neural networks, maximum mean discrepancy minimization and kernel Stein discrepancy minimization. Our algorithm is based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics, for which we provide a new, fast mixing guarantee. In addition, we demonstrate that our algorithm converges globally in total variation distance, bridging the theoretical gap between the dynamics and its practical implementation.
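As a rough illustration of what an N-particle underdamped Langevin iteration looks like, the sketch below applies a plain Euler–Maruyama step to the commonly used form $dx = v\,dt$, $dv = -\gamma v\,dt - \nabla\frac{\delta F}{\delta\mu}(\mu)(x)\,dt + \sqrt{2\gamma}\,dB$, with $\mu$ replaced by the empirical measure of the particles. This is not the spacetime discretization proposed in the paper, and the normalization of the dynamics is an assumption. The toy functional, `toy_grad`, `em_underdamped_step`, `run`, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

def toy_grad(xs, target=1.0, reg=1.0):
    """Gradient of the first variation of a hypothetical 1-D toy functional
    F(mu) = 0.5 * (E_mu[x] - target)^2 + 0.5 * reg * E_mu[x^2],
    evaluated at the empirical measure of the particles xs."""
    mean = sum(xs) / len(xs)
    return [(mean - target) + reg * x for x in xs]

def em_underdamped_step(xs, vs, grad_fn, gamma, step, rng):
    """One Euler-Maruyama step for N interacting underdamped Langevin particles.
    Assumed dynamics: dx = v dt, dv = -(gamma * v + grad) dt + sqrt(2 * gamma) dB."""
    grads = grad_fn(xs)  # interaction enters through the empirical measure
    noise_scale = math.sqrt(2.0 * gamma * step)
    new_xs = [x + step * v for x, v in zip(xs, vs)]
    new_vs = [
        v - step * (gamma * v + g) + noise_scale * rng.gauss(0.0, 1.0)
        for v, g in zip(vs, grads)
    ]
    return new_xs, new_vs

def run(n_particles=200, n_steps=2000, gamma=2.0, step=0.01, seed=0):
    """Simulate the toy particle system from Gaussian positions and zero velocities."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    for _ in range(n_steps):
        xs, vs = em_underdamped_step(xs, vs, toy_grad, gamma, step, rng)
    return xs, vs
```

For this toy functional the mean-field stationary law is Gaussian with mean `target / (1 + reg)`, so the empirical mean of `run()[0]` should settle near 0.5 for the default arguments, up to particle and discretization error.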

Paper Structure

This paper contains 47 sections, 14 theorems, 169 equations, 2 figures, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

If the assumptions from convexity through the LSI of the PGD hold, and $\mu_0$ has finite second moment, finite entropy, and finite Fisher information, then the law $\mu_t$ of the UMFLD with $\gamma=\sqrt{\mathscr{L}}$ and $\mathcal{E}$ defined in Eq. (Lyap1) satisfy,

Figures (2)

  • Figure 1: Evaluation of N-ULA, N-LA, and EM-N-ULA with different numbers of particles $N$. The x-axis shows the training epochs and the y-axis shows the value of $\frac{1}{2n}\sum_{i=1}^n\left(\frac{1}{N}\sum_{s=1}^N h(x^s;a_i)-f(a_i)\right)^2$. Our method often performs better in the high particle-approximation regime, which is consistent with our theoretical findings.
  • Figure 2: N-ULA with different numbers of particles
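
The quantity on the y-axis of Figure 1 is the squared-error loss of a mean-field network that averages a neuron $h(x^s;a_i)$ over $N$ particles. A minimal sketch of how it could be computed follows. The neuron `neuron` (a tanh of an inner product) and the function name `mean_field_loss` are assumptions for illustration, since the summary does not specify $h$ or $f$.

```python
import math

def neuron(x, a):
    """Hypothetical neuron h(x; a) = tanh(<x, a>); the paper's h may differ."""
    return math.tanh(sum(xi * ai for xi, ai in zip(x, a)))

def mean_field_loss(particles, inputs, targets, h=neuron):
    """Compute (1 / 2n) * sum_i ( (1 / N) * sum_s h(x^s; a_i) - f(a_i) )^2.

    particles: list of N parameter vectors x^s
    inputs:    list of n data points a_i
    targets:   list of n target values f(a_i)
    """
    n = len(inputs)
    n_particles = len(particles)
    total = 0.0
    for a, y in zip(inputs, targets):
        # Mean-field prediction: average the neuron output over all particles.
        pred = sum(h(x, a) for x in particles) / n_particles
        total += (pred - y) ** 2
    return total / (2.0 * n)
```

For example, `mean_field_loss([[0.0]], [[1.0]], [1.0])` evaluates a single particle at the origin, whose prediction is `tanh(0) = 0`, against a target of 1.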

Theorems & Definitions (21)

  • Definition 1: LSI
  • Definition 2
  • Theorem 3.1: Mean-field underdamped Langevin dynamics
  • Theorem 3.2: N-particle underdamped Langevin dynamics
  • Theorem 3.3: Mean-field underdamped Langevin algorithm
  • Theorem 3.4: N-particle underdamped Langevin algorithm
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • ...and 11 more