Touring sampling with pushforward maps

Vivien Cabannes, Charles Arnal

TL;DR

This paper surveys principled ways to obtain fast, deterministic pushforward maps between distributions in generative modeling, unifying test-function, density, and flow-based viewpoints. It clarifies how diffusion and flow-based samplers relate to pushforward maps, and shows equivalences to integral probability metrics such as MMD and Wasserstein distances. Key contributions include a unified variational framework using $D_\infty(\mu, \mu_1)$ and its mean-version surrogate $D_2^2(\mu, \mu_1; \tau)$, along with connections to likelihood-based training and normalizing flows. The work discusses practical implications for speed-accuracy trade-offs, including distillation and direct transport maps, and offers a roadmap for integrating transport- and diffusion-based methods to achieve fast, diverse generation in practice.

Abstract

The number of sampling methods can be daunting for a practitioner looking to apply powerful machine learning methods to their specific problem. This paper takes a theoretical stance to review and organize many sampling approaches in the "generative modeling" setting, where one wants to generate new data that are similar to some training examples. By revealing links between existing methods, it might prove useful for overcoming some of the current challenges in sampling with diffusion models, such as long inference times due to diffusion simulation, or the lack of diversity in generated samples.

Paper Structure (18 sections, 15 equations, 1 figure)

Figures (1)

  • Figure 1: Learning to map the Gaussian distribution to the uniform one with the mean formulation $D_2^2$, with ${\mathcal{F}}$ taken as the set of functions $\{x\in[0,1]\mapsto e^{2\pi i kx} \mid k\in[10]\}$, $\tau$ uniform over those ten functions, and $\varphi$ parameterized by a small multi-layer perceptron. A solution to this problem is given by the cumulative distribution function of the Gaussian, which relates to the error function. The ground truth is plotted in blue, while the learned pushforward is plotted in dashed orange.
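
The setting of Figure 1 can be checked numerically without training a network. The sketch below is illustrative only and is not the authors' code. It assumes that the mean formulation takes the form $D_2^2(\mu, \mu_1; \tau) = \mathbb{E}_{f\sim\tau}\,|\mathbb{E}_{\mu} f - \mathbb{E}_{\mu_1} f|^2$, that $[10]$ denotes $\{1, \dots, 10\}$, and that the source Gaussian is standard normal. The helper names `gaussian_cdf` and `mean_discrepancy` are hypothetical. Under these assumptions, each test function integrates to zero against the uniform distribution, so the objective reduces to the average squared modulus of the empirical Fourier moments of the pushed-forward samples.

```python
import math
import numpy as np


def gaussian_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def mean_discrepancy(samples, num_freqs=10):
    """Monte Carlo estimate of mean_k |E[f_k(X)] - E_uniform[f_k]|^2.

    Assumed test functions: f_k(x) = exp(2*pi*i*k*x), k = 1..num_freqs.
    Each f_k has zero mean under the uniform law on [0, 1], so only the
    empirical moment of the samples remains.
    """
    k = np.arange(1, num_freqs + 1)[:, None]  # shape (K, 1)
    moments = np.exp(2j * np.pi * k * samples[None, :]).mean(axis=1)
    return float(np.mean(np.abs(moments) ** 2))


rng = np.random.default_rng(0)
z = rng.standard_normal(20_000)  # samples from the source Gaussian

# Candidate pushforward maps from the real line to [0, 1].
loss_cdf = mean_discrepancy(gaussian_cdf(z))               # candidate solution
loss_sigmoid = mean_discrepancy(1.0 / (1.0 + np.exp(-z)))  # plausible but wrong map

print(f"Gaussian CDF map: {loss_cdf:.2e}")
print(f"Sigmoid map:      {loss_sigmoid:.2e}")
```

The Gaussian CDF should give a loss close to zero, up to Monte Carlo noise of order one over the sample size, while the sigmoid map leaves a visibly larger residual because its pushforward concentrates around 0.5 instead of spreading uniformly.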