The fast rate of convergence of the smooth adapted Wasserstein distance
Martin Larsson, Jonghwa Park, Johannes Wiesel
TL;DR
This work tackles the slow convergence of empirical path-space distributions under the adapted Wasserstein distance by incorporating Gaussian smoothing, yielding a dimension-free fast rate. The authors show that for subgaussian $\mu$ and smoothing level $\sigma$, the smooth adapted Wasserstein distance satisfies $\mathbb{E}[\mathcal{A}\mathcal{W}^{(\sigma)}_p(\hat{\mu}_n, \mu)] \le C/\sqrt{n}$. The proof shows that smoothed measures have locally Lipschitz kernels and combines a dynamic programming principle with empirical-process theory. Consequently, the ordinary Wasserstein distance inherits improved rates via $\mathcal{W}_p \le C\mathcal{A}\mathcal{W}_p$ under the same moment conditions, and the results extend to practical statistical tasks such as SMPD-based martingale tests. The findings provide dimension-robust guarantees for time-dependent probability measures, with explicit dependence on $p$, $T$, $d$, $\sigma$, and exponential moment parameters. This advances the understanding of how smoothing interacts with adapted structures to overcome the curse of dimensionality in path-space estimation and related decision problems.
Abstract
Estimating a $d$-dimensional distribution $\mu$ by the empirical measure $\hat{\mu}_n$ of its samples is an important task in probability theory, statistics and machine learning. It is well known that $\mathbb{E}[\mathcal{W}_p(\hat{\mu}_n, \mu)]\lesssim n^{-1/d}$ for $d>2p$, where $\mathcal{W}_p$ denotes the $p$-Wasserstein metric. An effective tool to combat this curse of dimensionality is the smooth Wasserstein distance $\mathcal{W}^{(\sigma)}_p$, which measures the distance between two probability measures after convolving them with isotropic Gaussian noise $\mathcal{N}(0,\sigma^2\text{I})$. In this paper we apply this smoothing technique to the adapted Wasserstein distance. We show that the smooth adapted Wasserstein distance $\mathcal{A}\mathcal{W}_p^{(\sigma)}$ achieves the fast rate of convergence $\mathbb{E}[\mathcal{A}\mathcal{W}_p^{(\sigma)}(\hat{\mu}_n, \mu)]\lesssim n^{-1/2}$ if $\mu$ is subgaussian. This result follows from the surprising fact that any subgaussian measure $\mu$ convolved with a Gaussian distribution has locally Lipschitz kernels.
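To make the smoothing step concrete, the following is a minimal sketch of the plain (non-adapted) smooth Wasserstein distance in one dimension. It is not the authors' method: the function names `wasserstein_1d` and `smooth_wasserstein_1d` and the parameters `m` and `seed` are hypothetical. The sketch assumes two samples of equal size, uses the fact that in one dimension the optimal coupling of two equal-size empirical measures matches sorted points, and approximates the Gaussian convolution by Monte Carlo, adding `m` independent noise draws to each sample point.

```python
import numpy as np


def wasserstein_1d(x, y, p=2):
    """p-Wasserstein distance between two equal-size 1D empirical measures.

    In one dimension the optimal coupling pairs the sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


def smooth_wasserstein_1d(x, y, sigma, p=2, m=100, seed=0):
    """Monte Carlo approximation of W_p between the Gaussian-smoothed samples.

    Each empirical measure convolved with N(0, sigma^2) is a Gaussian mixture;
    it is approximated here by adding m independent noise draws to every point.
    (m and seed are illustrative parameters, not taken from the paper.)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs = (x[:, None] + sigma * rng.standard_normal((x.size, m))).ravel()
    ys = (y[:, None] + sigma * rng.standard_normal((y.size, m))).ravel()
    return wasserstein_1d(xs, ys, p=p)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample = rng.standard_normal(500)
    # Shifting a sample by a constant shifts the smoothed measure by the same
    # constant, so the estimate should be close to the shift.
    print(smooth_wasserstein_1d(sample, sample + 3.0, sigma=1.0))
```

The adapted distance studied in the paper additionally restricts couplings to respect the time structure of the paths, which this one-dimensional sketch does not attempt to model.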
