Distributional Limit Theory for Optimal Transport

Eustasio del Barrio, Alberto González-Sanz, Jean-Michel Loubes, David Rodríguez-Vítores

TL;DR

The paper surveys distributional limit theorems for empirical optimal transport functionals, detailing one- and multi-dimensional CLTs for OT costs, maps, and potentials, including regularized and sliced variants. It emphasizes how dimensionality, regularity, and discrete/semi-discrete structures influence limiting distributions, and it presents results enabling inference such as confidence bands and hypothesis tests using Sinkhorn divergences and smooth OT. The work highlights key techniques (Hadamard differentiability, Efron–Stein linearization, Z-estimation) and how lower-complexity adaptation and semi-discrete settings can mitigate the curse of dimensionality, while also outlining open problems and potential applications in fairness and governance. Overall, the article provides a comprehensive roadmap of current distributional theory for OT and its practical implications for statistical inference in high-dimensional settings.

Abstract

Optimal Transport (OT) is a resource allocation problem with applications in biology, data science, economics and statistics, among others. In some of these applications, practitioners only have access to samples that approximate the continuous measure. Hence the quantities of interest derived from OT -- plans, maps and costs -- are only available in their empirical versions. Statistical inference on OT aims at finding confidence intervals for the population plans, maps and costs. In recent years this topic has gained increasing interest in the statistical community. In this paper we provide a comprehensive review of the most influential results in this research field, underlining some of the applications. Finally, we provide a list of open problems.

Paper Structure

This paper contains 27 sections, 21 theorems, 144 equations.

Key Result

Theorem 2.1

Assume $d=1$. If $p>1$, $P$ and $Q$ have finite moments of order $2p$, and $Q$ has a continuous quantile function, then [display equation not recovered], where $\sigma_p^2(P,Q)$, defined in limiting_variance_1d, is strictly positive unless $P=Q$ or $P$ is a Dirac measure. [A second display equation and the definition following "where" were not recovered]; here $B$ is a standard Brownian bridge on $[0,1]$. The distribution of $\gamma(P,Q)$ is Gaussian if $\ell(F=G)=0$.
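To make the one-dimensional setting concrete, the following minimal sketch computes the empirical transport cost $W_p^p$ between two samples of equal size on the real line, using the standard fact that in $d=1$ the optimal coupling matches order statistics. It then draws Monte Carlo replicates of the rescaled, centered cost. The function names, the Gaussian sampling distributions, the two-sample setup, and the centering at the Monte Carlo mean are all illustrative assumptions; the sketch does not reproduce the theorem's exact centering or its limiting variance.

```python
import random


def empirical_wp_cost(xs, ys, p=2.0):
    """Empirical cost W_p^p between two equal-size samples on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1D the optimal coupling pairs the i-th smallest x with the i-th smallest y.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) ** p for a, b in pairs) / len(xs)


def simulate_fluctuations(n=500, reps=200, p=2.0, shift=1.0, seed=0):
    """Monte Carlo draws of sqrt(n) * (cost - average cost over replicates).

    Hypothetical setup: P = N(0, 1) and Q = N(shift, 1), both sampled.
    """
    rng = random.Random(seed)
    costs = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(shift, 1.0) for _ in range(n)]
        costs.append(empirical_wp_cost(xs, ys, p))
    mean_cost = sum(costs) / reps
    return [n ** 0.5 * (c - mean_cost) for c in costs]
```

A histogram of `simulate_fluctuations()` gives a rough visual check of whether the fluctuations look approximately Gaussian when $P \neq Q$; it is a diagnostic aid, not a verification of the theorem.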

Theorems & Definitions (26)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Theorem 2.4
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Theorem 3.6
  • ...and 16 more