Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

Chin-Wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville

TL;DR

Convex Potential Flows (CP-Flow) parameterize an invertible flow as the gradient of a strongly convex potential. This makes inversion a convex optimization problem and yields a memory-efficient gradient estimator for the log-determinant of the Jacobian, computed with the conjugate gradient method. Grounded in optimal transport theory, CP-Flow is proven to be a universal density approximator and optimal in the OT sense: the learned map aligns with the Brenier map for the quadratic cost. Empirically, CP-Flow achieves competitive density estimation and variational inference results, potentially with fewer parameters than competing flows, while offering robust inversion and principled gradient estimation. The framework shows that combining OT insights with convex optimization is practical in discrete-time normalizing flows, and it points to architectural refinements (e.g., ICNN variants) as a path to further gains.
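
The inversion idea can be sketched on a toy problem. The snippet below is a minimal illustration, not the paper's implementation: the potential `F`, the constants `ALPHA`, `W`, and `B`, and the plain gradient-descent solver are all assumptions made here for illustration. Because $f = \nabla F$, solving $f(x) = y$ is equivalent to minimizing the convex function $x \mapsto F(x) - \langle y, x \rangle$, whose gradient is $f(x) - y$.

```python
import math

# Hypothetical toy potential (not from the paper):
#   F(x) = (ALPHA / 2) * ||x||^2 + sum_k softplus(w_k . x + b_k)
# The quadratic term makes F strongly convex with modulus ALPHA.
ALPHA = 1.0
W = [(1.0, 0.5), (-0.7, 1.2), (0.3, -0.9)]
B = [0.1, -0.4, 0.2]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def flow(x):
    """Gradient map f(x) = grad F(x) = ALPHA * x + sum_k sigmoid(w_k . x + b_k) * w_k."""
    out = [ALPHA * xi for xi in x]
    for w, b in zip(W, B):
        s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        out = [o + s * wi for o, wi in zip(out, w)]
    return out


def invert(y, tol=1e-12, max_iter=10_000):
    """Solve f(x) = y by gradient descent on the convex objective F(x) - <y, x>."""
    # Upper bound on the largest Hessian eigenvalue (the sigmoid derivative is <= 1/4).
    smoothness = ALPHA + 0.25 * sum(sum(wi * wi for wi in w) for w in W)
    x = [0.0] * len(y)
    for _ in range(max_iter):
        grad = [fi - yi for fi, yi in zip(flow(x), y)]  # gradient of the objective
        if max(abs(g) for g in grad) < tol:
            break
        x = [xi - g / smoothness for xi, g in zip(x, grad)]
    return x


x_true = [0.8, -1.5]
y = flow(x_true)       # forward pass
x_rec = invert(y)      # inverse pass via convex optimization
print(max(abs(a - b) for a, b in zip(x_rec, x_true)))
```

Gradient descent with step size `1 / smoothness` is used only because it is easy to verify. Any convex solver could take its place, and a tighter `tol` trades more iterations for a more accurate inverse.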

Abstract

Flow-based models are powerful tools for designing probabilistic models with tractable density. This paper introduces Convex Potential Flows (CP-Flow), a natural and efficient parameterization of invertible models inspired by the optimal transport (OT) theory. CP-Flows are the gradient map of a strongly convex neural potential function. The convexity implies invertibility and allows us to resort to convex optimization to solve the convex conjugate for efficient inversion. To enable maximum likelihood training, we derive a new gradient estimator of the log-determinant of the Jacobian, which involves solving an inverse-Hessian vector product using the conjugate gradient method. The gradient estimator has constant-memory cost, and can be made effectively unbiased by reducing the error tolerance level of the convex optimization routine. Theoretically, we prove that CP-Flows are universal density approximators and are optimal in the OT sense. Our empirical results show that CP-Flow performs competitively on standard benchmarks of density estimation and variational inference.
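
The inverse-Hessian vector product mentioned above can be illustrated with a small conjugate gradient (CG) solver that only needs Hessian-vector products. This is a sketch under stated assumptions, not the paper's estimator: the toy potential, the constants `W` and `B`, and the scalar parameter `alpha` are hypothetical. It relies on the standard identity $\partial_\theta \log\det H = \operatorname{tr}(H^{-1} \partial_\theta H)$. For the toy parameter `alpha` we have $\partial_\alpha H = I$, so the derivative is $\operatorname{tr}(H^{-1})$, which the code computes exactly with basis vectors (where a stochastic trace estimator would use random probe vectors) and compares against a finite difference.

```python
import math

# Hypothetical toy potential (not from the paper):
#   F(x) = (alpha / 2) * ||x||^2 + sum_k softplus(w_k . x + b_k)
# Its Hessian is H(x) = alpha * I + sum_k s_k * (1 - s_k) * w_k w_k^T,
# with s_k = sigmoid(w_k . x + b_k).
W = [(1.0, 0.5), (-0.7, 1.2), (0.3, -0.9)]
B = [0.1, -0.4, 0.2]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def hvp(x, v, alpha):
    """Hessian-vector product H(x) v, computed without forming H."""
    out = [alpha * vi for vi in v]
    for w, b in zip(W, B):
        s = 1.0 / (1.0 + math.exp(-(dot(w, x) + b)))
        c = s * (1.0 - s) * dot(w, v)
        out = [o + c * wi for o, wi in zip(out, w)]
    return out


def conjugate_gradient(matvec, rhs, tol=1e-12, max_iter=50):
    """Solve A u = rhs for a symmetric positive definite A, given only u -> A u."""
    u = [0.0] * len(rhs)
    r = list(rhs)  # residual rhs - A u, with u = 0 initially
    p = list(r)    # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        step = rs / dot(p, Ap)
        u = [ui + step * pi for ui, pi in zip(u, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return u


def logdet_hessian(x, alpha):
    """Exact log det of the 2x2 Hessian, used only as a reference value."""
    c1 = hvp(x, [1.0, 0.0], alpha)  # first column of H
    c2 = hvp(x, [0.0, 1.0], alpha)  # second column of H
    return math.log(c1[0] * c2[1] - c1[1] * c2[0])


x, alpha = [0.8, -1.5], 1.0
basis = [[1.0, 0.0], [0.0, 1.0]]
# tr(H^{-1}) = sum_i e_i^T H^{-1} e_i, with each H^{-1} e_i obtained by CG.
trace_inv = sum(
    dot(e, conjugate_gradient(lambda v: hvp(x, v, alpha), e)) for e in basis
)
eps = 1e-5
finite_diff = (logdet_hessian(x, alpha + eps) - logdet_hessian(x, alpha - eps)) / (2 * eps)
print(trace_inv, finite_diff)
```

The solver never stores the Hessian, only a handful of vectors, which is the property that keeps memory use independent of the number of CG iterations. The tolerance `tol` controls how accurately $H^{-1} v$ is resolved.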

Paper Structure

This paper contains 42 sections, 10 theorems, 28 equations, 11 figures, 10 tables, and 2 algorithms.

Key Result

Theorem 1

Let $\mu,\nu$ be probability measures with a finite second moment, and assume $\mu$ has a Lebesgue density $p_X$. Then there exists a convex potential $G$ such that the gradient map $g:=\nabla G$ (defined up to a null set) uniquely solves the Monge problem in eq:monge with the quadratic cost functio
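
As a concrete instance of the statement above, assume (for illustration only, this case is not taken from the paper) that both measures are one-dimensional Gaussians. The convex potential and its gradient map then have a closed form:

$$
\mu = \mathcal{N}(0, 1), \quad \nu = \mathcal{N}(m, \sigma^2), \ \sigma > 0:
\qquad G(x) = m x + \tfrac{\sigma}{2} x^2, \qquad g(x) = G'(x) = m + \sigma x .
$$

Here $G$ is convex because $G''(x) = \sigma > 0$, and $g$ pushes $\mu$ forward to $\nu$ because $x \sim \mathcal{N}(0, 1)$ implies $m + \sigma x \sim \mathcal{N}(m, \sigma^2)$. The resulting quadratic transport cost is

$$
\mathbb{E}_{x \sim \mu}\big[(x - g(x))^2\big] = \mathbb{E}_{x \sim \mu}\big[((1 - \sigma) x - m)^2\big] = (1 - \sigma)^2 + m^2 .
$$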

Figures (11)

  • Figure 1: Illustration of Convex Potential Flow. (a) Data $x$ drawn from a mixture of Gaussians. (b) Learned convex potential $F$. (c) Mesh grid distorted by the gradient map of the convex potential $f=\nabla F$. (d) Encoding of the data via the gradient map $z=f(x)$. Notably, the encoding is the value of the gradient of the convex potential. When the curvature of the potential function is locally flat, gradient values are small and this results in a contraction towards the origin.
  • Figure 2: Memory for training CIFAR-10.
  • Figure 3: Learning toy densities.
  • Figure 4: Approximating optimal transport map via maximum likelihood (minimizing KL divergence). In the first figure on the left we show the data in 2 dimensions. The datapoints are colored according to their horizontal values ($x_1$). The flows $f_{iaf}$ and $f_{cp}$ are trained to transform the data into a standard Gaussian prior. In the figures on the right, we plot the expected quadratic transportation cost versus the KL divergence for different numbers of dimensionality. During training the KL is minimized, so the curves read from the right to the left.
  • Figure 5: Softplus-type functions.
  • ...and 6 more figures

Theorems & Definitions (17)

  • Theorem 1: Brenier's Theorem (Theorem 1.22 of Santambrogio, 2015)
  • Theorem 2
  • Theorem 3: Universality
  • Theorem 4: Optimality
  • Definition 1
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • ...and 7 more