Estimating causal distances with non-causal ones

Beatrice Acciaio, Songyan Hou, Gudmund Pammer

TL;DR

The paper addresses the challenge of estimating the adapted Wasserstein distance $AW$ between laws of stochastic processes by deriving sharp upper bounds in terms of classical distances. It establishes the chain of inequalities $AW_p^p \lesssim ATV_p^p \lesssim TV_p^p$ and then connects these to the classical $W_1$ distance under Sobolev regularity, where the exponent $k/(k+1)$ is shown to be optimal. A fast-rate kernel-based estimator for $AW_1$ is developed; its convergence rate improves as the underlying densities gain smoothness and approaches the Monte Carlo rate as $k\to\infty$. The results hinge on a representation of TV and ATV via density processes and on a dynamic-programming analysis that yields sharp constants depending linearly on the time horizon. Taken together, for measures with smooth densities these contributions reduce the estimation of $AW$ to the more tractable estimation of $W$ and TV-type quantities, enabling faster nonparametric inference for dynamic stochastic optimization problems.
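To make the role of the exponent $k/(k+1)$ concrete, here is a minimal sketch that tabulates how a bound of the form $W_1^{k/(k+1)}$ behaves as the smoothness index $k$ grows. The summary above only states the exponent, so the Hölder-type form of the bound, the multiplicative constant being set to 1, and the function names `holder_exponent` and `holder_bound` are all assumptions made for illustration.

```python
def holder_exponent(k: float) -> float:
    """Exponent k/(k+1) quoted in the TL;DR; it tends to 1 as k grows."""
    if k <= 0:
        raise ValueError("k must be positive")
    return k / (k + 1)


def holder_bound(w1: float, k: float) -> float:
    """Illustrative bound w1 ** (k/(k+1)).

    Assumption: the multiplicative constant is taken to be 1, which is
    not the constant from the paper.
    """
    if w1 < 0:
        raise ValueError("w1 must be non-negative")
    return w1 ** holder_exponent(k)


if __name__ == "__main__":
    w1 = 1e-4  # hypothetical value of the classical W_1 distance
    for k in (1, 2, 5, 20, 100):
        print(f"k={k:>3}  exponent={holder_exponent(k):.4f}  "
              f"bound={holder_bound(w1, k):.3e}")
```

For a small $W_1$ value, the illustrative bound shrinks toward $W_1$ itself as $k$ increases, which mirrors the statement that the rate approaches the Monte Carlo rate for highly regular densities.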

Abstract

The adapted Wasserstein ($AW$) distance refines the classical Wasserstein ($W$) distance by incorporating the temporal structure of stochastic processes. This makes the $AW$-distance well-suited as a robust distance for many dynamic stochastic optimization problems where the classical $W$-distance fails. However, estimating the $AW$-distance is a notably challenging task, compared to the classical $W$-distance. In the present work, we build a sharp estimate for the $AW$-distance in terms of the $W$-distance, for smooth measures. This reduces estimating the $AW$-distance to estimating the $W$-distance, where many well-established classical results can be leveraged. As an application, we prove a fast convergence rate of the kernel-based empirical estimator under the $AW$-distance, which approaches the Monte-Carlo rate ($n^{-1/2}$) in the regime of highly regular densities. These results are accomplished by deriving a sharp bi-Lipschitz estimate of the adapted total variation distance by the classical total variation distance.
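As a rough illustration of what a kernel-based empirical estimator looks like, the sketch below smooths a one-dimensional sample with a Gaussian kernel and measures its classical $W_1$ distance to a second sample. It is a toy under stated assumptions, not the paper's construction: it works in dimension one with a single time step, it computes the classical $W_1$ rather than $AW_1$, and the Gaussian kernel, the bandwidth $n^{-1/5}$, and the helper names `smoothed_sample` and `w1_equal_size` are hypothetical choices.

```python
import random


def smoothed_sample(data, bandwidth, size, rng):
    """Draw `size` points from the Gaussian-kernel-smoothed empirical measure.

    Sampling from the smoothed measure amounts to picking a data point
    uniformly at random and adding N(0, bandwidth^2) noise to it.
    """
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(size)]


def w1_equal_size(xs, ys):
    """Exact W_1 between two 1D empirical measures with equally many atoms.

    In one dimension the optimal coupling matches sorted samples in order.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same length")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumption: n ** (-1/5) is a generic rule-of-thumb bandwidth,
    # not the bandwidth analysed in the paper.
    smooth = smoothed_sample(data, bandwidth=n ** (-1 / 5), size=n, rng=rng)
    print("W_1(smoothed, reference) =", w1_equal_size(smooth, reference))
```

The sorted-matching formula is specific to one dimension; in higher dimensions, or for the adapted distance, a different solver would be needed.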

Paper Structure

This paper contains 15 sections, 13 theorems, 157 equations, 4 figures.

Key Result

Theorem 2.5

Let $\mu,\nu \in \mathcal{P}(\mathbb{R}^{dT})$ and $w$ be a weighting function such that $(\nu,w)$ satisfies Assumption \ref{ass:conditional.moments} and $\int w(x) \, d(\mu + \nu) < \infty$. Then we have [...] where $(c_t)_{t = 2}^T$ are the constants in \eqref{eq:ass.cond.moments.1}. In particular, [...] Moreover, this bound is sharp, i.e., given any $c_t \geq 0$, $t=2,\dots,T$, we can construct $w$ s.t. there is a seq

Figures (4)

  • Figure 1: Visualization of $f(0,\cdot)$ and $g(0,\cdot)$.
  • Figure 2: Tree representation of the distributions $\gamma_1$ (top) and $\gamma_2$ (bottom) in Example \ref{ex:boundissharp}, for $T=5$. Edge labels indicate transition probabilities between nodes.
  • Figure 3: Visualization of $\mu^\epsilon$ and $\nu^\epsilon$ in Example \ref{ex:ct}.
  • Figure 4: Density of $p_\nu$ in Example \ref{ex:orderissharp}.

Theorems & Definitions (38)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Theorem 2.5
  • Remark 2.6
  • Corollary 2.7
  • Definition 2.8
  • Definition 2.9
  • Theorem 2.10
  • Definition 2.11: Besov space
  • ...and 28 more