Minimax Distribution Estimation in Wasserstein Distance

Shashank Singh, Barnabás Póczos

TL;DR

This work establishes minimax rates for estimating a distribution under Wasserstein loss using only moment constraints and the metric-entropy structure of the support. It provides a tight upper bound for the empirical distribution via a multi-resolution partitioning scheme, together with two complementary lower bounds based on packing radius and heavy-tailed moments, showing that the empirical distribution is often minimax-optimal. The results extend beyond totally bounded spaces to unbounded settings and connect to practical scenarios including Euclidean spaces, manifolds, and latent-variable models, with direct implications for Monte Carlo integration and Wasserstein-based learning. Overall, the paper clarifies how metric complexity and moment conditions jointly govern the rate of distribution estimation in Wasserstein distance and highlights the empirical estimator’s strong performance under broad conditions.

Abstract

The Wasserstein metric is an important measure of distance between probability distributions, with applications in machine learning, statistics, probability theory, and data analysis. This paper provides upper and lower bounds on statistical minimax rates for the problem of estimating a probability distribution under Wasserstein loss, using only metric properties, such as covering and packing numbers, of the sample space, and weak moment assumptions on the probability distributions.
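
As a small numerical illustration of estimating a distribution under Wasserstein loss, the sketch below measures how far the empirical distribution of $n$ samples is from the true distribution. Everything specific here is an assumption made for illustration and is not taken from the paper: the true distribution is Uniform$[0, 1]$, the Wasserstein order is $1$, and the function names `w1_to_uniform` and `mean_error` are hypothetical. In one dimension the distance can be computed exactly from the quantile formula $W_1 = \int_0^1 |F_n^{-1}(u) - u| \, du$.

```python
import numpy as np


def w1_to_uniform(sample):
    """Exact 1-Wasserstein distance between the empirical distribution of
    `sample` and Uniform[0, 1] (illustrative assumption, not the paper's setup).

    Uses the 1-D quantile formula: the empirical quantile function equals the
    i-th order statistic on ((i-1)/n, i/n], while the uniform quantile is u.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    a = np.arange(n) / n            # left endpoints (i-1)/n
    b = np.arange(1, n + 1) / n     # right endpoints i/n

    # Antiderivative of u -> |u - x| is (u - x) * |u - x| / 2.
    def antiderivative(u):
        return (u - x) * np.abs(u - x) / 2.0

    return float(np.sum(antiderivative(b) - antiderivative(a)))


def mean_error(n, trials=20, seed=0):
    """Average W_1 error of the empirical distribution over repeated draws."""
    rng = np.random.default_rng(seed)
    return float(np.mean([w1_to_uniform(rng.random(n)) for _ in range(trials)]))


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, mean_error(n))
```

In this toy setting the printed errors should shrink as $n$ grows. How fast they shrink in general metric spaces, as a function of covering numbers and moment conditions, is the question the paper's bounds address.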

Paper Structure

This paper contains 15 sections, 14 theorems, 82 equations.

Key Result

Theorem 1

Let $x_0 \in \Omega$ and suppose $m_{\ell,x_0}(P) \in [1, \infty)$. Let $J \in \mathbb{N}$ and $\varepsilon > 0$. For each $k \in \mathbb{N}$, define $B_{2^k}(x_0) := \left\{ y \in \Omega : 2^k \leq \rho(x_0, y) < 2^{k + 1} \right\}$. Then, for $\ell \in (r, \infty) \setminus \{2r\}$, where $C_{\ell,r}$ is a constant depending only on $\ell$ and $r$. Moreover, when $\ell = 2r$, the bound ineq:uppe
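
The dyadic shells $B_{2^k}(x_0)$ defined in the theorem can be computed directly for a finite sample. The sketch below is only an illustration under stated assumptions: $\rho$ is taken to be the Euclidean metric on $\mathbb{R}^d$, $\mathbb{N}$ is taken to include $0$, and the function names `dyadic_shell_index` and `shell_counts` are hypothetical. A point $y$ lies in $B_{2^k}(x_0)$ exactly when $k = \lfloor \log_2 \rho(x_0, y) \rfloor$, so each point gets a single shell index.

```python
import numpy as np


def dyadic_shell_index(points, x0):
    """Return k = floor(log2(rho(x0, y))) for each row y of `points`.

    Assumes rho is the Euclidean metric (an illustrative choice). A negative
    index means rho(x0, y) < 1, i.e. y lies in no shell with k >= 0.
    """
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts - np.asarray(x0, dtype=float), axis=1)
    # frexp writes d = m * 2**e with m in [0.5, 1), so floor(log2(d)) = e - 1.
    # This avoids rounding issues of log2 at exact powers of two.
    _, e = np.frexp(d)
    return e - 1


def shell_counts(points, x0):
    """Count how many points fall in each shell B_{2^k}(x0), k = 0, 1, 2, ..."""
    k = dyadic_shell_index(points, x0)
    return np.bincount(k[k >= 0])


if __name__ == "__main__":
    sample = [[1.0, 0.0], [1.5, 0.0], [0.0, 2.0], [3.9, 0.0], [0.0, 4.0], [0.5, 0.0]]
    print(dyadic_shell_index(sample, [0.0, 0.0]))
    print(shell_counts(sample, [0.0, 0.0]))
```

Points closer than distance $1$ to $x_0$ receive a negative index and are excluded from the counts, which matches the shells as defined for $k \geq 0$ under the assumption above.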

Theorems & Definitions (24)

  • Theorem 1: Upper Bound
  • Theorem 2: Minimax Lower Bound in Terms of Packing Radius
  • Theorem 3: Minimax Lower Bound for Heavy-Tailed Distributions
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Lemma 7
  • Lemma 8: Theorem 1 of han2015minimax
  • Theorem 9
  • proof
  • ...and 14 more