Mean-field underdamped Langevin dynamics and its spacetime discretization

Qiang Fu; Ashia Wilson

TL;DR

This work proposes a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures, based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics.

Abstract

We propose a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures. Examples of problems with this formulation include training mean-field neural networks, maximum mean discrepancy minimization and kernel Stein discrepancy minimization. Our algorithm is based on a novel spacetime discretization of the mean-field underdamped Langevin dynamics, for which we provide a new, fast mixing guarantee. In addition, we demonstrate that our algorithm converges globally in total variation distance, bridging the theoretical gap between the dynamics and its practical implementation.
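To make the setting concrete, here is a minimal sketch of an N-particle underdamped Langevin simulation for the mean-field neural network example. It is not the paper's spacetime discretization, which is not given in this summary. It uses a plain Euler-Maruyama step instead, and the activation `tanh`, the quadratic regularizer `reg`, the temperature `temp`, and the noise scaling `sqrt(2 * gamma * temp * step)` are all assumptions chosen for illustration.

```python
import numpy as np


def grad_first_variation(x, a, f, reg):
    """Gradient in x of the first variation of an assumed objective
    F(mu) = 1/(2n) * sum_i (E_mu[tanh(x . a_i)] - f_i)^2 + (reg/2) * E_mu|x|^2,
    evaluated at the empirical measure of the particles.

    x: (N, d) particle positions, a: (n, d) inputs, f: (n,) targets.
    """
    act = np.tanh(x @ a.T)               # (N, n) unit outputs h(x^s; a_i)
    resid = act.mean(axis=0) - f         # (n,) mean-field prediction error
    # Chain rule: d/dx tanh(x . a_i) = (1 - tanh^2) * a_i
    return ((1.0 - act**2) * resid) @ a / a.shape[0] + reg * x


def em_underdamped_particles(x, v, a, f, step, n_steps,
                             gamma=1.0, temp=1e-3, reg=1e-3, seed=0):
    """Euler-Maruyama steps for position/velocity particles (hypothetical helper).

    Positions move with the current velocity; velocities feel friction
    (-gamma * v), the mean-field force, and Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    x, v = x.copy(), v.copy()            # do not mutate the caller's arrays
    noise_scale = np.sqrt(2.0 * gamma * temp * step)
    for _ in range(n_steps):
        force = grad_first_variation(x, a, f, reg)
        noise = rng.standard_normal(v.shape)
        x_next = x + step * v
        v = v - step * (gamma * v + force) + noise_scale * noise
        x = x_next
    return x, v
```

Each particle interacts with the others only through the averaged prediction `act.mean(axis=0)`, which is what makes the objective a non-linear functional of the particle distribution rather than a sum of independent per-particle losses.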

Paper Structure

This paper contains 47 sections, 14 theorems, 169 equations, 2 figures, 2 tables, 1 algorithm.

Introduction
Organization
Preliminaries
Notation
Background
Definitions and assumptions
Related work
N-particle underdamped Langevin algorithm
Convergence analysis
Proof sketches
Discussion of mixing time results
Applications of Algorithm 1
Training mean-field neural networks
Density estimation via MMD minimization
Kernel Stein discrepancy minimization
...and 32 more sections

Key Result

Theorem 3.1

If the assumptions from convexity through the LSI of the PGD hold, and $\mu_0$ has finite second moment, finite entropy and finite Fisher information, then the law $\mu_t$ of the UMFLD with $\gamma=\sqrt{\mathscr{L}}$ and $\mathcal{E}$ defined in eq:Lyap1 satisfy,

Figures (2)

Figure 1: Evaluation of N-ULA, N-LA and EM-N-ULA with different numbers of particles N. The x-axis shows training epochs and the y-axis shows the value of $\frac{1}{2n}\sum_{i=1}^n\left(\frac{1}{N}\sum_{s=1}^N h(x^s;a_i)-f(a_i)\right)^2$. Our method often performs better in the high particle-approximation regime, which is consistent with our theoretical findings.
Figure 2: N-ULA with different numbers of particles
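
The y-axis quantity in Figure 1 can be computed as in the short sketch below. The function names `particle_mse` and `tanh_unit` are hypothetical, and the choice $h(x;a)=\tanh(x\cdot a)$ is an assumption made only so the example runs; the summary does not say which unit the paper uses.

```python
import numpy as np


def particle_mse(particles, inputs, targets, h):
    """Return 1/(2n) * sum_i ( (1/N) * sum_s h(x^s; a_i) - f(a_i) )^2.

    particles: (N, d) array of x^s, inputs: (n, d) array of a_i,
    targets: (n,) array of f(a_i), h: callable (particles, a) -> (N,) outputs.
    """
    # Average the unit outputs over particles for each input a_i.
    preds = np.array([h(particles, a).mean() for a in inputs])
    return 0.5 * np.mean((preds - targets) ** 2)


def tanh_unit(particles, a):
    """Assumed unit h(x; a) = tanh(x . a), for illustration only."""
    return np.tanh(particles @ a)
```

For instance, a single particle at the origin predicts `tanh(0) = 0` for every input, so with one target equal to 1 the metric is `0.5 * (0 - 1)**2 = 0.5`.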

Theorems & Definitions (21)

Definition 1: LSI
Definition 2
Theorem 3.1: Mean-field underdamped Langevin dynamics
Theorem 3.2: N-particle underdamped Langevin dynamics
Theorem 3.3: Mean-field underdamped Langevin algorithm
Theorem 3.4: N-particle underdamped Langevin algorithm
Lemma 1
proof
Lemma 2
proof
...and 11 more
