
The fast rate of convergence of the smooth adapted Wasserstein distance

Martin Larsson, Jonghwa Park, Johannes Wiesel

TL;DR

This work tackles the slow convergence of empirical path-space distributions under the adapted Wasserstein distance by incorporating Gaussian smoothing, which yields a fast rate whose exponent does not depend on the dimension. The authors show that for subgaussian $\mu$ and smoothing level $\sigma$, the smooth adapted Wasserstein distance satisfies $\mathbb{E}[\mathcal{A}\mathcal{W}^{(\sigma)}_p(\hat{\mu}_n, \mu)] \le C/\sqrt{n}$, by proving that smoothed measures have locally Lipschitz kernels and combining a dynamic programming principle with empirical-process theory. Because the Wasserstein distance is dominated by the adapted Wasserstein distance, the smooth Wasserstein distance inherits the same rate via $\mathcal{W}^{(\sigma)}_p \le C\mathcal{A}\mathcal{W}^{(\sigma)}_p$ under the same moment conditions, and the results extend to practical testing/statistical tasks such as SMPD-based martingale tests. The findings provide dimension-robust guarantees for time-dependent probability measures, with a constant that depends on $p$, $T$, $d$, $\sigma$, and the exponential moment parameters. This advances the understanding of how smoothing interacts with adapted structures to overcome the curse of dimensionality in path-space estimation and related decision problems.

Abstract

Estimating a $d$-dimensional distribution $\mu$ by the empirical measure $\hat{\mu}_n$ of its samples is an important task in probability theory, statistics and machine learning. It is well known that $\mathbb{E}[\mathcal{W}_p(\hat{\mu}_n, \mu)]\lesssim n^{-1/d}$ for $d>2p$, where $\mathcal{W}_p$ denotes the $p$-Wasserstein metric. An effective tool to combat this curse of dimensionality is the smooth Wasserstein distance $\mathcal{W}^{(\sigma)}_p$, which measures the distance between two probability measures after convolving them with isotropic Gaussian noise $\mathcal{N}(0,\sigma^2\mathrm{I})$. In this paper we apply this smoothing technique to the adapted Wasserstein distance. We show that the smooth adapted Wasserstein distance $\mathcal{A}\mathcal{W}_p^{(\sigma)}$ achieves the fast rate of convergence $\mathbb{E}[\mathcal{A}\mathcal{W}_p^{(\sigma)}(\hat{\mu}_n, \mu)]\lesssim n^{-1/2}$ if $\mu$ is subgaussian. This result follows from the surprising fact that any subgaussian measure $\mu$ convolved with a Gaussian distribution has locally Lipschitz kernels.
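
To make the smoothing step concrete, here is a minimal Python sketch of the (non-adapted) smooth Wasserstein distance in dimension one. It is an illustration under stated assumptions, not the authors' procedure: the convolution with $\mathcal{N}(0,\sigma^2)$ is approximated by adding independent Gaussian noise to each sample, and the function names and the parameters `n_noise` and `seed` are hypothetical. In one dimension, the $p$-Wasserstein distance between two equal-size empirical measures is obtained by matching sorted samples.

```python
import random


def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size empirical measures on R.

    In one dimension the optimal coupling matches the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    xs, ys = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)


def smooth_wasserstein_1d(xs, ys, sigma, p=1, n_noise=20, seed=0):
    """Monte Carlo approximation of the smooth distance W_p^(sigma).

    Assumption: the convolution with N(0, sigma^2) is approximated by
    perturbing every sample with n_noise independent Gaussian draws.
    """
    rng = random.Random(seed)
    xs_smooth = [x + rng.gauss(0.0, sigma) for x in xs for _ in range(n_noise)]
    ys_smooth = [y + rng.gauss(0.0, sigma) for y in ys for _ in range(n_noise)]
    return wasserstein_1d(xs_smooth, ys_smooth, p)
```

Calling `smooth_wasserstein_1d(sample_a, sample_b, sigma=0.5)` on two equal-length lists of floats returns a noisy estimate; larger `n_noise` reduces the Monte Carlo error from the noise draws at the cost of more sorting work.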

Paper Structure

This paper contains 12 sections, 15 theorems, 138 equations, 1 figure.

Key Result

Theorem 4

Let $1<p<\infty$ and $\sigma>0$. Suppose that $\mu$ is a probability measure on $(\mathbb{R}^d)^T$, where $d\ge 1$ and $T\ge 2$, such that $\int e^{q\left\vert x\right\vert^2/(2\sigma^2)}\mu(dx)<\infty$ for $q>8p(2p-1)(T+9)$. Then there exists a constant $C>0$ that depends only on $p, q, d, T, \sigma$ ...
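
To get a feel for how demanding the moment condition is, the following Python sketch evaluates the threshold $8p(2p-1)(T+9)$ from the statement above. The function names are hypothetical. The second helper assumes the condition is read as "for some $q$ above the threshold" and uses the standard fact that, for a centered Gaussian with covariance $s^2\mathrm{I}$, the integral $\int e^{q\vert x\vert^2/(2\sigma^2)}\mu(dx)$ is finite exactly when $q s^2<\sigma^2$.

```python
import math


def moment_threshold(p, T):
    """Value 8p(2p-1)(T+9) that the exponent q must exceed in Theorem 4."""
    if p <= 1:
        raise ValueError("Theorem 4 is stated for 1 < p < infinity")
    if T < 2:
        raise ValueError("Theorem 4 is stated for T >= 2")
    return 8 * p * (2 * p - 1) * (T + 9)


def gaussian_std_bound(sigma, p, T):
    """Strict upper bound on s for mu = N(0, s^2 I) to satisfy the condition.

    Assumption: the condition holds for some q above the threshold, so we
    need q * s^2 < sigma^2 with q > moment_threshold(p, T).
    """
    return sigma / math.sqrt(moment_threshold(p, T))
```

For $p=2$ and $T=2$ the threshold is $8\cdot 2\cdot 3\cdot 11=528$, so under the reading above a Gaussian $\mu$ must have standard deviation below $\sigma/\sqrt{528}$.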

Figures (1)

  • Figure 1: $\mu=\frac{1}{2}\delta_{(0,1)}+\frac{1}{2}\delta_{(0,-1)}$ on the left and $\mu_{\varepsilon}=\frac{1}{2}\delta_{(\varepsilon, 1)}+\frac{1}{2}\delta_{(-\varepsilon, -1)}$ on the right.
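
The pair in Figure 1 is a standard example in the adapted-transport literature of two measures that are close in Wasserstein distance but far apart in adapted Wasserstein distance. The Python sketch below computes both distances for it, under assumptions that are not confirmed by this page: each atom is read as a two-step path $(x_1, x_2)$, the path cost is $\sum_t \vert x_t-y_t\vert^p$, and the adapted distance is evaluated through the two-step dynamic programming formula (optimal cost between conditional laws, integrated against a coupling of the first marginals). All function names are hypothetical.

```python
from itertools import permutations


def wasserstein_pp_1d(mu, nu, p):
    """W_p^p between discrete measures on R given as lists of (point, weight).

    Uses the monotone (quantile) coupling, which is optimal in one dimension.
    """
    mu, nu = sorted(mu), sorted(nu)
    i = j = 0
    wi, wj = mu[0][1], nu[0][1]
    cost = 0.0
    while True:
        moved = min(wi, wj)
        cost += moved * abs(mu[i][0] - nu[j][0]) ** p
        wi -= moved
        wj -= moved
        if wi <= 1e-12:
            i += 1
            if i == len(mu):
                break
            wi = mu[i][1]
        if wj <= 1e-12:
            j += 1
            if j == len(nu):
                break
            wj = nu[j][1]
    return cost


def figure1_distances(eps, p=1.0):
    """Return (W_p, AW_p) between mu and mu_eps from Figure 1."""
    mu_paths = [(0.0, 1.0), (0.0, -1.0)]
    nu_paths = [(eps, 1.0), (-eps, -1.0)]

    def path_cost(x, y):
        return sum(abs(a - b) ** p for a, b in zip(x, y))

    # Both measures are uniform on two atoms, so an optimal coupling
    # is given by a permutation of the atoms.
    w_pp = min(
        sum(path_cost(x, nu_paths[k]) for x, k in zip(mu_paths, perm)) / 2
        for perm in permutations(range(2))
    )

    # Adapted distance: the first marginal of mu is a Dirac mass at 0, so the
    # coupling of first marginals is unique. The kernel of mu at 0 is a fair
    # coin on {1, -1}; the kernels of mu_eps are Dirac masses.
    kernel_mu = [(1.0, 0.5), (-1.0, 0.5)]
    nu_first = [(eps, 0.5, [(1.0, 1.0)]), (-eps, 0.5, [(-1.0, 1.0)])]
    aw_pp = sum(
        w * (abs(0.0 - y1) ** p + wasserstein_pp_1d(kernel_mu, kernel, p))
        for y1, w, kernel in nu_first
    )
    return w_pp ** (1.0 / p), aw_pp ** (1.0 / p)
```

Under these conventions, `figure1_distances(eps)` with $p=1$ gives a Wasserstein distance of $\varepsilon$ and an adapted Wasserstein distance of $1+\varepsilon$, so the first vanishes as $\varepsilon\to 0$ while the second stays bounded away from zero.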

Theorems & Definitions (32)

  • Definition 1: Smooth Wasserstein distance
  • Definition 2: Bicausal coupling
  • Definition 3: The adapted Wasserstein distance
  • Theorem 4: Fast rate
  • Proposition 5: Smoothed measures have Lipschitz kernels; exact statement in Proposition \ref{prop:lipkernel}
  • Proposition 6: Kernels of a smoothed measure
  • Proposition 7: Proposition 2.1 in \cite{goldfeld2024limit}
  • Remark 8
  • Lemma 9
  • proof
  • ...and 22 more