Estimating causal distances with non-causal ones
Beatrice Acciaio, Songyan Hou, Gudmund Pammer
TL;DR
The paper addresses the challenge of estimating the adapted Wasserstein distance $AW$ between laws of stochastic processes by deriving sharp upper bounds in terms of classical, non-adapted distances. It establishes the chain of inequalities $AW_p^p \lesssim ATV_p^p \lesssim TV_p^p$, where $ATV$ and $TV$ denote the adapted and classical total variation distances, and then connects these to the classical $W_1$ distance under Sobolev regularity of the densities; the resulting exponent $k/(k+1)$, with $k$ the smoothness index, is shown to be optimal. A fast-rate kernel-based estimator for $AW_1$ is developed whose convergence rate improves as the underlying densities gain smoothness and approaches the Monte Carlo rate $n^{-1/2}$ as $k\to\infty$. The results hinge on a representation of $TV$ and $ATV$ via density processes and on a dynamic-programming-based analysis that yields sharp constants depending linearly on the time horizon. Together, these contributions reduce the estimation of $AW$ for smooth densities to the more tractable estimation of $W$ and $TV$-type quantities, enabling faster nonparametric inference for dynamic stochastic optimization problems.
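As a schematic illustration of why the exponent $k/(k+1)$ governs the estimation rate, consider the following sketch. The constant $C$, the estimator $\hat\mu_n$ and the rate $r_n$ are placeholders introduced here, not quantities taken from the paper, and $C$ is assumed to be uniform in $n$, which is a simplification. Suppose that for sufficiently smooth $\mu,\nu$

$$AW_1(\mu,\nu) \le C\, W_1(\mu,\nu)^{k/(k+1)},$$

and that the estimator satisfies $\mathbb{E}\big[W_1(\mu,\hat\mu_n)\big] \le r_n$. Since $x \mapsto x^{k/(k+1)}$ is concave on $[0,\infty)$, Jensen's inequality gives

$$\mathbb{E}\big[AW_1(\mu,\hat\mu_n)\big] \le C\, \mathbb{E}\big[W_1(\mu,\hat\mu_n)^{k/(k+1)}\big] \le C\, r_n^{k/(k+1)}.$$

For example, $k=1$ gives the exponent $1/2$ and $k=9$ gives $9/10$. As $k\to\infty$ the exponent tends to $1$, so under these assumptions the $AW_1$ rate inherits the $W_1$ rate almost without loss.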
Abstract
The adapted Wasserstein ($AW$) distance refines the classical Wasserstein ($W$) distance by incorporating the temporal structure of stochastic processes. This makes the $AW$-distance well-suited as a robust distance for many dynamic stochastic optimization problems where the classical $W$-distance fails. However, estimating the $AW$-distance is notably more challenging than estimating the classical $W$-distance. In the present work, we build a sharp estimate for the $AW$-distance in terms of the $W$-distance, for smooth measures. This reduces estimating the $AW$-distance to estimating the $W$-distance, where many well-established classical results can be leveraged. As an application, we prove a fast convergence rate of the kernel-based empirical estimator under the $AW$-distance, which approaches the Monte Carlo rate ($n^{-1/2}$) in the regime of highly regular densities. These results are accomplished by deriving a sharp bi-Lipschitz estimate of the adapted total variation distance by the classical total variation distance.
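To make the idea of a kernel-based empirical estimator concrete, here is a minimal Python sketch of drawing samples from a kernel-smoothed empirical measure, that is, the empirical measure of observed paths convolved with a kernel. The Gaussian kernel, the fixed `bandwidth`, and the function name `sample_smoothed_empirical` are assumptions made for illustration. The text above does not specify the kernel or the bandwidth schedule used in the paper.

```python
import random


def sample_smoothed_empirical(paths, bandwidth, n_draws, seed=0):
    """Draw paths from the empirical measure convolved with a Gaussian kernel.

    paths     : list of observed paths, each a list of floats (one per time step)
    bandwidth : standard deviation of the Gaussian noise (illustrative choice)
    n_draws   : number of smoothed paths to generate
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Step 1: pick one observed path uniformly at random (empirical measure).
        base = rng.choice(paths)
        # Step 2: perturb every time step with independent Gaussian noise (kernel).
        draws.append([x + bandwidth * rng.gauss(0.0, 1.0) for x in base])
    return draws


# Hypothetical toy data: two observed paths with three time steps each.
observed = [[0.0, 0.1, 0.3], [0.0, -0.2, -0.1]]
smoothed = sample_smoothed_empirical(observed, bandwidth=0.05, n_draws=4)
print(len(smoothed), len(smoothed[0]))
```

Setting `bandwidth=0` recovers plain resampling from the empirical measure. A positive bandwidth yields a measure with a smooth density, which is the setting in which the smooth-measure estimates described above apply.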
