An Approximation Theory Framework for Measure-Transport Sampling Algorithms

Ricardo Baptista, Bamdad Hosseini, Nikola B. Kovachki, Youssef M. Marzouk, Amir Sagiv

TL;DR

The paper introduces a general approximation-theoretic framework for analyzing measure transport algorithms for probabilistic modeling, and develops new stability estimates that relate the distance between two maps to the distance (or divergence) between the pushforward measures they define.

Abstract

This article presents a general approximation-theoretic framework to analyze measure transport algorithms for probabilistic modeling. A primary motivating application for such algorithms is sampling, a central task in statistical inference and generative modeling. We provide a priori error estimates in the continuum limit, i.e., when the measures (or their densities) are given but the transport map is discretized or approximated using a finite-dimensional function space. Our analysis relies on the regularity theory of transport maps and on classical approximation theory for high-dimensional functions. A third element of our analysis, which is of independent interest, is the development of new stability estimates that relate the distance between two maps to the distance (or divergence) between the pushforward measures they define. We present a series of applications of our framework, where quantitative convergence rates are obtained for practical problems using Wasserstein metrics, maximum mean discrepancy, and Kullback-Leibler divergence. Specialized rates for approximations of the popular triangular Knöthe-Rosenblatt maps are obtained, followed by numerical experiments that demonstrate and extend our theory.
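
To make the idea of a stability estimate concrete, the following is a minimal sketch of one standard instance: for two maps $T$ and $S$ and a reference measure $\eta$, the coupling $(T(x), S(x))$ with $x \sim \eta$ gives $W_2(T_\sharp\eta, S_\sharp\eta) \le \|T - S\|_{L^2(\eta)}$. The maps `T` and `S`, the Gaussian reference, and the helper `empirical_w2_1d` are illustrative assumptions and are not taken from the paper; the check is done on empirical measures in one dimension, where $W_2$ reduces to matching sorted samples.

```python
import numpy as np

def empirical_w2_1d(a, b):
    """W_2 between two equal-size 1D empirical measures.

    In one dimension the optimal coupling pairs the sorted samples.
    """
    a_sorted, b_sorted = np.sort(a), np.sort(b)
    return np.sqrt(np.mean((a_sorted - b_sorted) ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)  # samples from an assumed reference eta = N(0, 1)

# Hypothetical maps: T plays the role of a "true" map, S of an approximation.
def T(z):
    return z + 0.5 * np.tanh(z)

def S(z):
    return z + 0.5 * np.tanh(z) + 0.1 * np.sin(3.0 * z)

# Distance between the maps in L^2 of the empirical reference measure.
map_err = np.sqrt(np.mean((T(x) - S(x)) ** 2))

# Distance between the two empirical pushforward measures.
w2 = empirical_w2_1d(T(x), S(x))

print(f"||T - S||_L2 = {map_err:.4f}")
print(f"W_2(T#eta, S#eta) = {w2:.4f}")
```

On the empirical measures the inequality holds exactly, because pairing `T(x[i])` with `S(x[i])` is one admissible coupling while the sorted pairing is the optimal one.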

Paper Structure

This paper contains 35 sections, 22 theorems, 121 equations, 4 figures, 2 tables.

Key Result

Theorem 2.2

Suppose Assumption \ref{ass:abstract} holds and consider $\widehat{\nu}$ as in \ref{eq:abst_opt}. Then it holds that [...] where $C>0$ is the same constant as in Assumption \ref{ass:abstract}(i).

Figures (4)

  • Figure 1: Convergence results for approximating $\nu_k$ in \ref{eq:nuk_compact1d} using Legendre polynomials of degree $n$ and the optimization problem \ref{eq:opt_wass_practical} with $p=2$ for (a) $W_2 (\nu_k, \widehat{\nu}^n)$, (b) $\|T_k - \widehat{T}^n \|_{L^2}$.
  • Figure 2: Approximation of the map $T$ by minimizing the Wasserstein-2 distance for (a) the pushforward measure $(T_1)_\sharp\eta$ in \ref{eq:nuk_compact1d} using Legendre polynomials on a compact domain and (b) the Gumbel distribution in \ref{eq:gumble} using Hermite functions on the entire real line.
  • Figure 3: Convergence of the pushforward measure to the Gumbel distribution in terms of the Wasserstein objective in \ref{eq:opt_wass_practical}, the empirical Wasserstein distance, and the $L^p$ error in the approximate map $\widehat{T}^n$ when (a) solving \ref{eq:opt_wass_practical} for $p=1$, and (b) using \ref{eq:wass_minim_ls} for $p=2$. The dashed lines illustrate exponential convergence rates based on empirical fits to the computed Wasserstein distances.
  • Figure 4: (a) Convergence of the pullback of a Gaussian reference $\eta$ to the Gumbel distribution in terms of KL divergence and the $L^2$ error in the approximate map $\widehat{T}^n$ when solving \ref{eq:KLminimization_GaussianRef_obj}. (b) The true ($T^\dagger$) and approximate ($\widehat{T}^n$) maps found with the monotone parameterization in Definition \ref{defn:monotone_triangular_maps}, using Hermite functions of increasing polynomial degree.
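
The kind of convergence study described in the captions above can be illustrated with a small sketch. It does not solve the paper's optimization problems: as a stand-in, it computes the plain $L^2$ projection of a map onto Legendre polynomials of increasing degree on $[-1, 1]$ and records the map error. The target `target_map` (here $\tanh(2x)$), the quadrature size, and the chosen degrees are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L

# Gauss-Legendre quadrature nodes and weights on [-1, 1].
nodes, weights = L.leggauss(64)

def target_map(x):
    # Hypothetical smooth, monotone map (an assumption, not from the paper).
    return np.tanh(2.0 * x)

def l2_projection_error(degree):
    """L^2 error of the degree-`degree` Legendre projection of target_map.

    The error is measured with respect to the uniform probability
    measure on [-1, 1] and evaluated by quadrature.
    """
    # Columns of V are P_0, ..., P_degree evaluated at the nodes.
    V = L.legvander(nodes, degree)
    k = np.arange(degree + 1)
    # Projection coefficients: c_k = (2k + 1) / 2 * integral of T * P_k.
    coeffs = (2 * k + 1) / 2 * (V.T @ (weights * target_map(nodes)))
    approx = L.legval(nodes, coeffs)
    residual = target_map(nodes) - approx
    return np.sqrt(np.sum(weights * residual ** 2) / 2.0)

degrees = (1, 3, 5, 9, 15)
errors = [l2_projection_error(n) for n in degrees]

for n, err in zip(degrees, errors):
    print(f"degree {n:2d}: L2 error = {err:.3e}")
```

Because the assumed target is analytic, the error decays rapidly with the degree. Only odd degrees are listed because $\tanh(2x)$ is an odd function, so its even-degree Legendre coefficients vanish.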

Theorems & Definitions (38)

  • Theorem 2.2
  • Proof 1
  • Theorem 3.1
  • Proof 2
  • Theorem 3.2
  • Proof 3
  • Remark 3.3
  • Remark 3.4: Applicability of the hypotheses
  • Theorem 3.5
  • Theorem 3.6
  • ...and 28 more