Error estimates between SGD with momentum and underdamped Langevin diffusion

Arnaud Guillin, Yu Wang, Lihu Xu, Haoran Yang

Abstract

Stochastic gradient descent with momentum is a popular variant of stochastic gradient descent, which has recently been reported to have a close relationship with the underdamped Langevin diffusion. In this paper, we establish a quantitative error estimate between them in the 1-Wasserstein and total variation distances.
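To make the two objects being compared concrete, the following is a minimal one-dimensional sketch, not the scheme analysed in the paper. It assumes the textbook heavy-ball form of SGD with momentum and a plain Euler-Maruyama discretisation of the underdamped Langevin diffusion $dX_t = V_t\,dt$, $dV_t = -(\gamma V_t + \nabla U(X_t))\,dt + \sqrt{2\gamma T}\,dB_t$. The function names, the synthetic Gaussian gradient noise, and the parameters `lr`, `momentum`, `noise_std`, `step`, and `temperature` are illustrative assumptions; the paper's own step sizes, scaling, and noise model may differ.

```python
import math
import random


def sgd_momentum(grad, x0, lr, momentum, noise_std, n_steps, rng):
    """Heavy-ball SGD in 1D; gradient noise is synthetic Gaussian (an assumption)."""
    x, v = float(x0), 0.0
    for _ in range(n_steps):
        # Noisy gradient: exact gradient plus Gaussian perturbation.
        g = grad(x) + noise_std * rng.gauss(0.0, 1.0)
        v = momentum * v - lr * g  # velocity (momentum) update
        x = x + v                  # position update
    return x


def underdamped_langevin(grad, x0, step, gamma, temperature, n_steps, rng):
    """Euler-Maruyama discretisation of the underdamped Langevin diffusion in 1D."""
    x, v = float(x0), 0.0
    # Standard deviation of the Brownian increment entering the velocity.
    scale = math.sqrt(2.0 * gamma * temperature * step)
    for _ in range(n_steps):
        x_new = x + step * v  # dX = V dt
        # dV = -(gamma V + grad U(X)) dt + sqrt(2 gamma T) dB, using the old x
        v = v - step * (gamma * v + grad(x)) + scale * rng.gauss(0.0, 1.0)
        x = x_new
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    quad_grad = lambda x: x  # gradient of U(x) = x^2 / 2
    print(sgd_momentum(quad_grad, 1.0, 0.1, 0.9, 0.5, 1000, rng))
    print(underdamped_langevin(quad_grad, 1.0, 0.01, 2.0, 0.5, 10000, rng))
```

In both loops the position is driven by a velocity that is damped (through `momentum` or `gamma`) and pushed by a noisy gradient, which is the structural similarity the abstract refers to. With the noise switched off, both iterations contract to the minimiser of the quadratic potential.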

Paper Structure

This paper contains 21 sections, 23 theorems, 307 equations.

Key Result

Theorem 1

Suppose that Assumptions 1, 2, and 3 hold, that $\eta_1 \leqslant c$ for some positive constant $c$, and that $\gamma > \sqrt{2}(2L+a)/\sqrt{a}$. Then we have

Theorems & Definitions (45)

  • Theorem 1
  • Theorem 2
  • Corollary 3
  • Corollary 4
  • Corollary 5
  • Remark 6
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • ...and 35 more