Wasserstein GANs are Minimax Optimal Distribution Estimators

Arthur Stéphanovitch, Eddie Aamari, Clément Levrard

TL;DR

The paper establishes non-asymptotic, minimax-optimal rates for Wasserstein GAN estimators when the target distribution is a Hölder-smooth pushforward of a latent uniform distribution, achieving rates of $O(n^{-(\beta+\gamma)/(2\beta+d)} \vee n^{-1/2})$ up to log factors under Hölder IPMs. It introduces a sharp interpolation inequality between Hölder IPMs on manifolds, enabling uniform rates across discriminator smoothness $\gamma$ and supporting a tractable GAN estimator in the manifold setting. The work develops a wavelet-based framework to describe regularity and to construct generator and discriminator classes, proving minimax optimality across three models: a general low-dimensional model with theoretical discriminators, a full-dimensional density-based model with tractable discriminators, and a manifold-based model that combines tractability with minimax optimality. Together, these results provide a principled foundation for why WGANs can achieve minimax-optimal distribution estimation in complex, structured data settings. The findings underscore the role of regularity and manifold structure and offer practical pathways to implementable, minimax-achieving GAN estimators via wavelet-inspired neural architectures.
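
For orientation, an integral probability metric over a function class $\mathcal{F}$ is conventionally defined as the largest gap in expectations that any test function in $\mathcal{F}$ can detect. The notation $d_{\mathcal{F}}$ below is an assumption made for illustration and may differ from the paper's own; taking $\mathcal{F}$ to be a unit ball of $\gamma$-Hölder functions gives a Hölder IPM of the kind referred to above.

$$
d_{\mathcal{F}}(\mu,\nu) \;=\; \sup_{f\in\mathcal{F}} \left| \int f \, d\mu - \int f \, d\nu \right|
$$

In a GAN, the discriminator class plays the role of $\mathcal{F}$, so a smoother discriminator class (larger $\gamma$) corresponds to a weaker metric and hence a faster attainable rate.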

Abstract

We provide non-asymptotic rates of convergence for the Wasserstein Generative Adversarial Network (WGAN) estimator. We build neural network classes representing the generators and discriminators, which yield a GAN that achieves the minimax optimal rate for estimating a certain probability measure $\mu$ with support in $\mathbb{R}^p$. The probability measure $\mu$ is taken to be the pushforward of the Lebesgue measure on the $d$-dimensional torus $\mathbb{T}^d$ by a map $g^\star:\mathbb{T}^d\rightarrow \mathbb{R}^p$ of smoothness $\beta+1$. Measuring the error with the $\gamma$-Hölder Integral Probability Metric (IPM), we obtain, up to logarithmic factors, the minimax optimal rate $O(n^{-\frac{\beta+\gamma}{2\beta+d}}\vee n^{-\frac{1}{2}})$, where $n$ is the sample size, $\beta$ determines the smoothness of the target measure $\mu$, $\gamma$ is the smoothness of the IPM ($\gamma=1$ is the Wasserstein case) and $d\leq p$ is the intrinsic dimension of $\mu$. In the process, we derive a sharp interpolation inequality between Hölder IPMs. This novel result in the theory of function spaces generalizes classical interpolation inequalities to the case where the measures involved have densities on different manifolds.
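
To make the two regimes of the stated rate concrete, the following minimal sketch computes the exponent $a$ in $n^{-a}$, ignoring logarithmic factors and constants. The function names `wgan_rate_exponent` and `wgan_rate` are hypothetical helpers introduced here for illustration, not part of the paper. Since $\frac{\beta+\gamma}{2\beta+d} \geq \frac{1}{2}$ exactly when $\gamma \geq d/2$, the parametric rate $n^{-1/2}$ takes over once the IPM is smooth enough relative to the intrinsic dimension.

```python
from fractions import Fraction


def wgan_rate_exponent(beta, gamma, d):
    """Return a such that the stated rate is n**(-a), log factors ignored.

    Hypothetical helper: it only evaluates the exponent quoted in the
    abstract and assumes 2*beta + d > 0.
    """
    # Nonparametric regime: (beta + gamma) / (2*beta + d).
    smooth_regime = (Fraction(beta) + Fraction(gamma)) / (2 * Fraction(beta) + d)
    # The maximum of two rates n**(-a1) and n**(-a2) decays with the
    # smaller exponent, so the parametric regime caps the exponent at 1/2.
    return min(smooth_regime, Fraction(1, 2))


def wgan_rate(n, beta, gamma, d):
    """Evaluate n**(-a) for the exponent above (constants and logs omitted)."""
    return n ** (-float(wgan_rate_exponent(beta, gamma, d)))


if __name__ == "__main__":
    # Wasserstein case (gamma = 1), intrinsic dimension 4, beta = 1.
    print(wgan_rate_exponent(1, 1, 4))  # 1/3
    # gamma >= d/2: the parametric rate n**(-1/2) dominates.
    print(wgan_rate_exponent(2, 3, 4))  # 1/2
```

Note that the exponent depends on the intrinsic dimension $d$ only, not on the ambient dimension $p$.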

Paper Structure

This paper contains 70 sections, 53 theorems, 433 equations, and 1 table.

Key Result

Theorem 3.1

If $g^\star \in \mathcal{H}^{\beta+1}_K(\mathbb{T}^d,\mathbb{R}^p)$, $\mathcal{G}\subset \mathcal{H}^{\beta+1}_K(\mathbb{T}^d,\mathbb{R}^p)$ and $\mathcal{D} \subset \mathcal{H}^{\gamma}_1(B^p(0,K),\mathbb{R})$, then the GAN estimator WGANS satisfies

Theorems & Definitions (102)

  • Definition 2.1
  • Theorem 3.1
  • Proposition 3.2
  • Lemma 3.3
  • Definition 3.4
  • Lemma 3.5
  • Proposition 3.6
  • Proposition 3.7
  • Theorem 4.1
  • Corollary 4.2
  • ...and 92 more