
Stability and Sample Complexity of Divergence Regularized Optimal Transport

Erhan Bayraktar, Stephan Eckstein, Xin Zhang

TL;DR

The paper shows that divergence regularization can improve the sample-complexity convergence rate compared to unregularized optimal transport, and proves upper bounds that exploit the regularity of the cost function and of the divergence functional, as well as the intrinsic dimension of the marginals.

Abstract

We study stability and sample complexity properties of divergence regularized optimal transport (DOT). First, we obtain quantitative stability results for optimizers of DOT measured in Wasserstein distance, which are applicable to a wide class of divergences and simultaneously improve known results for entropic optimal transport. Second, we study the case of sample complexity, where the DOT problem is approximated using empirical measures of the marginals. We show that divergence regularization can improve the corresponding convergence rate compared to unregularized optimal transport. To this end, we prove upper bounds which exploit both the regularity of cost function and divergence functional, as well as the intrinsic dimension of the marginals. Along the way, we establish regularity properties of dual optimizers of DOT, as well as general limit theorems for empirical measures with suitable classes of test functions.
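The sample complexity setting can be illustrated with a plug-in estimator: draw samples from each marginal, form the empirical measures, and solve the regularized problem between them. The sketch below is a minimal illustration for the entropic (KL) special case only, solved with Sinkhorn iterations. The function name `entropic_ot_value`, the weight `lam`, the uniform marginals on $[0,1]$ and the sample sizes are all assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def entropic_ot_value(x, y, lam=0.1, n_iter=500):
    """Plug-in value of  min_pi <C, pi> + lam * KL(pi | a x b)
    for the empirical measures (uniform weights) on samples x and y.
    Hypothetical helper: `lam` is an illustrative regularization weight."""
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    C = (x[:, None] - y[None, :]) ** 2      # squared-distance cost
    K = np.exp(-C / lam)                    # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):                 # Sinkhorn updates w.r.t. a x b
        u = 1.0 / (K @ (b * v))
        v = 1.0 / (K.T @ (a * u))
    density = u[:, None] * K * v[None, :]   # d pi / d(a x b)
    pi = density * a[:, None] * b[None, :]
    return float(np.sum(pi * C) + lam * np.sum(pi * np.log(density)))

rng = np.random.default_rng(0)

# A larger sample serves as a stand-in for the population value (an assumption).
reference = entropic_ot_value(rng.uniform(size=500), rng.uniform(size=500))

# Average absolute deviation from the reference for growing sample sizes.
for n in (10, 50, 200):
    errors = [
        abs(entropic_ot_value(rng.uniform(size=n), rng.uniform(size=n)) - reference)
        for _ in range(5)
    ]
    print(n, np.mean(errors))
```

The printed deviations give only an informal picture of how the empirical value approaches the reference as `n` grows; they do not reproduce the rates proved in the paper.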

Paper Structure

This paper contains 14 sections, 14 theorems, 102 equations, 1 figure.

Key Result

Lemma 2.1

There exists a unique $\pi^* \in \Pi(\boldsymbol \mu)$ attaining the optimal value of the DOT problem. Assuming boundedness of $c$, a dual formulation holds in which the supremum is taken over measurable and bounded functions $h : X \rightarrow \mathbb{R}$ of the form $h(x) = \sum_{i=1}^N h_i(x_i)$.

Figures (1)

  • Figure 1: A log-scale visualization of optimizers $\pi^*$ of $\mathsf{OT}^\varepsilon_\varphi(\mu_1, \mu_2)$ for different $\varphi$ defining the regularization divergence $D_\varphi$. This example illustrates that structural properties of $\pi^*$, like the size of the support, can vary greatly depending on the type of regularization. From left to right, the respective sizes of the support of $\pi^*$ are 100 (full support), 44 and 28. The problem parameters are $\varepsilon=100$, $\mu_1 = \mu_2 = \frac{1}{10}\sum_{i=0}^9 \delta_{i/9}$ and $c(x_1, x_2) = (x_2-x_1)^2$.
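The full-support case in Figure 1 can be reproduced in spirit with a short Sinkhorn computation. The sketch below uses the marginals and cost from the caption, but it assumes an entropic (KL) divergence and a hypothetical regularization weight `lam`. The caption does not say which $\varphi$ belongs to which panel, and the mapping between `lam` and the caption's $\varepsilon$ is not confirmed by this page.

```python
import numpy as np

# Marginals and cost as stated in the caption of Figure 1.
points = np.arange(10) / 9.0                 # atoms i/9 for i = 0, ..., 9
mu = np.full(10, 0.1)                        # mu_1 = mu_2 = uniform on the atoms
cost = (points[None, :] - points[:, None]) ** 2

lam = 0.1  # hypothetical weight; not the caption's epsilon convention

kernel = np.exp(-cost / lam)                 # strictly positive Gibbs kernel
u = np.ones(10)
v = np.ones(10)
for _ in range(1000):                        # Sinkhorn updates w.r.t. mu x mu
    u = 1.0 / (kernel @ (mu * v))
    v = 1.0 / (kernel.T @ (mu * u))

# Optimal plan for the entropic problem (up to numerical convergence).
plan = (mu * u)[:, None] * kernel * (mu * v)[None, :]

support_size = int(np.count_nonzero(plan > 0))
print(support_size)
```

Because the Gibbs kernel is strictly positive, every entry of `plan` is positive under this entropic choice, which matches the full-support panel. Divergences whose optimizers can vanish on part of the grid, as in the other two panels, would need a different solver and are not covered by this sketch.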

Theorems & Definitions (37)

  • Lemma 2.1: Existence of primal optimizers and duality
  • proof
  • Definition 2.2: Dual regularity of $\varphi$
  • Lemma 2.3: Existence of dual optimizers
  • proof
  • Definition 2.4: Shadow
  • Definition 2.5: Weakened Lipschitz condition of $c$
  • Lemma 2.6: Continuity of $\mathsf{OT}_{\varphi}$
  • Definition 3.1: Strong convexity assumption for $\varphi$
  • Example
  • ...and 27 more