Linear convergence of proximal descent schemes on the Wasserstein space

Razvan-Andrei Lascu, Mateusz B. Majka, David Šiška, Łukasz Szpruch

TL;DR

This work investigates proximal descent methods, inspired by the minimizing movement scheme introduced by Jordan, Kinderlehrer and Otto, for optimizing entropy-regularized functionals on the Wasserstein space and establishes linear convergence under flat convexity assumptions, thereby relaxing the common reliance on geodesic convexity.

Abstract

We investigate proximal descent methods, inspired by the minimizing movement scheme introduced by Jordan, Kinderlehrer and Otto, for optimizing entropy-regularized functionals on the Wasserstein space. We establish linear convergence under flat convexity assumptions, thereby relaxing the common reliance on geodesic convexity. Our analysis circumvents the need for discrete-time adaptations of the Evolution Variational Inequality (EVI). Instead, we leverage a uniform logarithmic Sobolev inequality (LSI) and the entropy "sandwich" lemma, extending the analysis from arXiv:2201.10469 and arXiv:2202.01009. The major challenge in the proof via LSI is to show that the relative Fisher information $I(\cdot|\pi)$ is well-defined at every step of the scheme. Since the relative entropy is not Wasserstein differentiable, we prove that along the scheme the iterates belong to a certain class of Sobolev regularity, and hence the relative entropy $\operatorname{KL}(\cdot|\pi)$ has a unique Wasserstein sub-gradient, and that the relative Fisher information is indeed finite.
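
For orientation, the objects named in the abstract can be written in their standard textbook form. This is a sketch under assumptions, not the paper's exact statement: the step size $\tau$, the LSI constant $C_{\mathrm{LSI}}$, the contraction factor $\rho$, the initial measure $\mu_0$ and the weight $\sigma$ in front of the entropy are illustrative notation, and the paper's normalizations and its semi-implicit and proximal variants may differ.

```latex
% Assumed notation: \tau > 0 (step size), C_{LSI} > 0 (LSI constant),
% \rho \in (0,1) (contraction factor); normalizations are illustrative only.

% Entropy-regularized objective (weight on KL is an assumed normalization)
F^{\sigma}(\mu) = F(\mu) + \sigma \operatorname{KL}(\mu|\pi)

% Flat convexity of F: convexity along linear interpolations of measures
F\big((1-t)\mu + t\nu\big) \le (1-t)F(\mu) + tF(\nu), \qquad t \in [0,1]

% One implicit minimizing-movement (JKO) step
\mu_{k+1} \in \operatorname*{arg\,min}_{\mu \in \mathcal{P}_2(\mathbb{R}^d)}
  \Big\{ F^{\sigma}(\mu) + \frac{1}{2\tau}\mathcal{W}_2^2(\mu, \mu_k) \Big\}

% Relative Fisher information and a logarithmic Sobolev inequality
I(\mu|\pi) = \int \Big| \nabla \log \frac{d\mu}{d\pi} \Big|^2 \, d\mu,
\qquad
\operatorname{KL}(\mu|\pi) \le \frac{1}{2 C_{\mathrm{LSI}}} I(\mu|\pi)

% Shape of a linear convergence statement
F^{\sigma}(\mu_k) - F^{\sigma}(\mu_{\sigma}^*)
  \le \rho^{k} \big( F^{\sigma}(\mu_0) - F^{\sigma}(\mu_{\sigma}^*) \big)
```

Flat convexity is taken along the mixture $(1-t)\mu + t\nu$ rather than along Wasserstein geodesics, which is the sense in which the assumption differs from geodesic convexity.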

Paper Structure

This paper contains 17 sections, 23 theorems, 130 equations.

Key Result

Proposition 2.4

Let Assumption ass:lip-flat and eq:regularity-U in Assumption ass:abs-cty-pi hold. Then $F^{\sigma}$ admits a unique minimizer $\mu_{\sigma}^* \in \mathcal{P}_2^{\lambda}(\mathbb R^d)$ given by [display equation missing from the extracted text], where $Z(\mu_{\sigma}^*)$ is a normalization constant.

Theorems & Definitions (43)

  • Proposition 2.4: zhenjiefict
  • Definition 2.5: Proximal Gibbs measure; Nitanda2022ConvexAO, chizat2022meanfield
  • Remark 2.6
  • Lemma 2.7: Nitanda2022ConvexAO, chizat2022meanfield
  • Theorem 3.1: Linear convergence of the schemes (eq:implicit-JKO), (eq:semi-implicit-JKO), (eq:proximal-JKO)
  • Corollary 3.2: Linear convergence in $\operatorname{KL}$ and $\mathcal{W}_2^2$
  • Example 3.3: Two-layer mean-field neural network; 10.1214/20-AIHP1140
  • Theorem 5.1: Existence and uniqueness of minimizer for (eq:implicit-JKO)
  • proof
  • Corollary 5.2: Existence of optimal transport maps along (eq:implicit-JKO)
  • ...and 33 more