Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures

Nina Vesseron, Louis Béthune, Marco Cuturi

TL;DR

This work departs from the standard two-block generative paradigm by introducing conjugate moment measures, which express a target density $\rho$ as $\rho = \nabla w^* \sharp \mathfrak{P}_w$ with $\mathfrak{P}_w \propto e^{-w}$. The authors establish an existence result via a Schauder fixed-point argument and provide practical learning and sampling procedures: an ICNN-based parameterization of $w$ and two methods, CMFGen for learning from samples and CMFMA for learning from energies. They derive Monge–Ampère-based relations linking $\rho$ to the conjugate potential and demonstrate the approach on univariate, 2D, and high-dimensional tasks such as MNIST and Cartoon image generation and inpainting, where conjugate-based sampling outperforms standard generative ICNN baselines. The results suggest that aligning the Gibbs factor with the target distribution improves sampling quality and opens avenues for using $\mathfrak{P}_w$ as a pre-trained noise model within broader generative pipelines.

Abstract

The canonical approach in generative modeling is to split model fitting into two blocks: define first how to sample noise (e.g. Gaussian) and choose next what to do with it (e.g. using a single map or flows). We explore in this work an alternative route that ties sampling and mapping. We find inspiration in moment measures, a result that states that for any measure $\rho$, there exists a unique convex potential $u$ such that $\rho=\nabla u \sharp e^{-u}$. While this does seem to tie effectively sampling (from log-concave distribution $e^{-u}$) and action (pushing particles through $\nabla u$), we observe on simple examples (e.g., Gaussians or 1D distributions) that this choice is ill-suited for practical tasks. We study an alternative factorization, where $\rho$ is factorized as $\nabla w^*\sharp e^{-w}$, where $w^*$ is the convex conjugate of a convex potential $w$. We call this approach conjugate moment measures, and show far more intuitive results on these examples. Because $\nabla w^*$ is the Monge map between the log-concave distribution $e^{-w}$ and $\rho$, we rely on optimal transport solvers to propose an algorithm to recover $w$ from samples of $\rho$, and parameterize $w$ as an input-convex neural network. We also address the common sampling scenario in which the density of $\rho$ is known only up to a normalizing constant, and propose an algorithm to learn $w$ in this setting.
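The contrast between the two factorizations can be illustrated on a 1D Gaussian target. The sketch below is a hedged illustration, not the authors' code: it assumes $\rho = \mathcal{N}(0, \sigma^2)$ with an arbitrary $\sigma^2 = 9$ and restricts attention to centered quadratic potentials. For the moment measure, $u(x) = \sigma^2 x^2 / 2$ gives $e^{-u} \propto \mathcal{N}(0, 1/\sigma^2)$ and $\nabla u(x) = \sigma^2 x$. For the conjugate version, the quadratic candidate $w(x) = x^2 / (2s)$ with $s = \sigma^{2/3}$ gives $e^{-w} \propto \mathcal{N}(0, s)$ and $\nabla w^*(y) = s\,y$, so the pushforward has variance $s^3 = \sigma^2$. This candidate is worked out here for illustration and is not quoted from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 9.0   # variance of the target rho = N(0, sigma2); illustrative value
n = 200_000

# Moment measure: u(x) = sigma2 * x^2 / 2, so e^{-u} is N(0, 1/sigma2)
# and grad u(x) = sigma2 * x.
x_u = rng.normal(0.0, np.sqrt(1.0 / sigma2), size=n)  # noise from the Gibbs factor
y_u = sigma2 * x_u                                     # pushed through grad u

# Conjugate moment measure (quadratic candidate, an assumption of this sketch):
# w(x) = x^2 / (2 s) with s = sigma2^(1/3), so e^{-w} is N(0, s),
# w^*(y) = s * y^2 / 2 and grad w^*(y) = s * y.
s = sigma2 ** (1.0 / 3.0)
x_w = rng.normal(0.0, np.sqrt(s), size=n)              # noise from the conjugate Gibbs factor
y_w = s * x_w                                          # pushed through grad w^*

print("target variance:          ", sigma2)
print("Gibbs factor variance:    ", x_u.var(), "-> pushforward:", y_u.var())
print("conj. Gibbs factor var.:  ", x_w.var(), "-> pushforward:", y_w.var())
```

Both pushforwards recover the target variance up to sampling error, but the noise distributions differ: the Gibbs factor shrinks as the target widens, while the conjugate Gibbs factor widens with it.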

Paper Structure

This paper contains 53 sections, 9 theorems, 44 equations, 18 figures, 5 tables, 1 algorithm.

Key Result

Proposition 1

Let $\rho = \mathcal{N}(0_{\mathbb{R}^d}, \Sigma)$. If $\Sigma$ is non-degenerate, the moment potentials of $\rho$ are $u_m(x) = \frac{1}{2} (x-m)^T\Sigma(x-m)$, with $m \in \mathbb{R}^d$. The associated Gibbs factor of $\rho$ is $\mathfrak{P}_{u_m} = \mathcal{N}(m, \Sigma^{-1})$.
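Proposition 1 can be checked numerically: if $x \sim \mathcal{N}(m, \Sigma^{-1})$, then $\nabla u_m(x) = \Sigma (x - m)$ has mean $0$ and covariance $\Sigma \Sigma^{-1} \Sigma = \Sigma$. The following is a minimal sketch assuming NumPy; the values of `Sigma` and `m` are arbitrary illustrative choices, not the paper's experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) non-degenerate covariance and shift.
Sigma = np.array([[2.0, 1.8],
                  [1.8, 2.0]])
m = np.array([1.0, -0.5])

# Sample from the Gibbs factor P_{u_m} = N(m, Sigma^{-1}).
n = 200_000
x = rng.multivariate_normal(m, np.linalg.inv(Sigma), size=n)

# Push samples through grad u_m(x) = Sigma (x - m), applied row-wise.
y = (x - m) @ Sigma.T

# Empirical moments of the pushforward; they should approximate N(0, Sigma).
emp_mean = y.mean(axis=0)
emp_cov = np.cov(y, rowvar=False)

print("empirical mean:", emp_mean)
print("empirical covariance:\n", emp_cov)
```

Changing `m` moves the Gibbs factor but leaves the pushforward unchanged, which matches the one-parameter family $u_m$ in the proposition.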

Figures (18)

  • Figure 1: Gibbs factor and conjugate Gibbs factor of $\rho = \mathcal{N}\left(0, \begin{bmatrix} 2 & 1.8 \\ 1.8 & 2 \end{bmatrix}\right)$.
  • Figure 2: Comparison between the Gibbs factor $\mathfrak{P}_u$ and the conjugate Gibbs factor $\mathfrak{P}_w$ for two mixtures of 1D Gaussian distributions, $\bm \rho_1$ and $\bm \rho_2$. The density plots overlay the (conjugate) Gibbs factor with $\bm \rho$ and a standard Gaussian $\bm{\mathcal{N}(0,1)}$ for reference. Gibbs factors spread inversely to $\rho$ ((a), (c)) while conjugate Gibbs factors show more suitable alignment ((b), (d)).
  • Figure 3: Learning the conjugate moment potential from an energy. $\mathcal{E}_1$ and $\mathcal{E}_2$ are learned by regression with CMFMA. The second column shows the learned energy; the third displays the corresponding conjugate moment potential; the fourth shows samples (in red) drawn from $\nabla w_\theta^* \sharp \mathfrak{P}_{w_\theta}$.
  • Figure 4: Samples from $\rho$ (top), level sets of $w_\theta$ (middle) and samples from $\nabla w_\theta^*\,\sharp\, e^{-w_\theta}$ (bottom).
  • Figure 5: MNIST Generation using CMFGen. Samples from the Gibbs noise distribution $\mathfrak{P}_{w_\theta}$ (left); digits generated from $\nabla w_{\theta}^* \sharp \mathfrak{P}_{w_\theta}$ (middle); digits generated by an ICNN trained to directly transport Gaussian noise to MNIST (right).
  • ...and 13 more figures

Theorems & Definitions (13)

  • Proposition 1
  • proof
  • Theorem 1
  • Proposition 2
  • Lemma 1
  • Theorem: Schauder's fixed-point theorem
  • Lemma 2
  • Lemma 3
  • Definition
  • Definition
  • ...and 3 more