Diffusion map particle systems for generative modeling

Fengyi Li, Youssef Marzouk

TL;DR

This paper introduces Diffusion Map Particle Systems (DMPS), a nonparametric generative modeling framework that combines diffusion maps with Laplacian-adjusted Wasserstein gradient descent (LAWGD). DMPS approximates the Langevin generator $\mathscr{L}$ with diffusion maps and uses the kernel $K_{\mathscr{L}^{-1}}$ of its inverse to drive particles toward the target distribution on a manifold. It requires no offline training, and the kernel bandwidth is the single parameter to tune. The authors provide a spectral analysis showing exponential decay of the KL divergence up to an $O(\varepsilon)$ diffusion-map bias. In experiments on a suite of synthetic manifolds and a real high-energy physics dataset, DMPS performs competitively with or better than SVGD, ULA, and diffusion-based generative models. The method is simple to implement, scales to moderate dimensions, and exploits the data geometry through graph-Laplacian constructions, which suggests extensions with more advanced kernels for higher-dimensional problems.

Abstract

We propose a novel diffusion map particle system (DMPS) for generative modeling, based on diffusion maps and Laplacian-adjusted Wasserstein gradient descent (LAWGD). Diffusion maps are used to approximate the generator of the corresponding Langevin diffusion process from samples, and hence to learn the underlying data-generating manifold. On the other hand, LAWGD enables efficient sampling from the target distribution given a suitable choice of kernel, which we construct here via a spectral approximation of the generator, computed with diffusion maps. Our method requires no offline training and minimal tuning, and can outperform other approaches on data sets of moderate dimension.
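The diffusion-map step described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel of the form $\exp(-\|x-y\|^2/(2\varepsilon))$ and a density normalization with exponent 1/2; both are conventions chosen here for illustration, and the paper's exact normalization, sign, and scaling may differ. The function name `diffusion_map_generator` is hypothetical.

```python
import numpy as np

def diffusion_map_generator(X, eps):
    """Sketch: approximate a Langevin generator on samples X (N x d).

    Assumed conventions (not taken from the paper): Gaussian kernel
    exp(-|x - y|^2 / (2 * eps)) and alpha = 1/2 density normalization.
    """
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * eps))
    # Row sums act as an unnormalized kernel density estimate.
    d = K.sum(axis=1)
    # Symmetric density normalization (alpha = 1/2).
    M = K / np.sqrt(np.outer(d, d))
    # Row-normalize to obtain a Markov transition matrix.
    P = M / M.sum(axis=1, keepdims=True)
    # Finite-difference generator approximation, up to sign and scaling.
    L = (P - np.eye(len(X))) / eps
    return P, L

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
P, L = diffusion_map_generator(X, eps=0.5)
```

Each row of `P` sums to one and each row of `L` sums to zero, so constant functions lie in the null space of the approximate generator, as expected for a diffusion generator. The bandwidth `eps` is the only parameter in this sketch.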

Paper Structure (28 sections, 5 theorems, 72 equations, 9 figures, 2 tables, 1 algorithm)

Key Result

Theorem 3.1

Let $T_\epsilon$ and $P_\epsilon(x,y)$ be defined as in the previous discussion. Then there is a sequence of non-negative eigenvalues $\{\lambda_i\}_{i \in \mathbb N}$ and an orthonormal basis of eigenfunctions $\{\phi_i\}_{i \in \mathbb N}$ of $T_\epsilon$, i.e., such that
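The statement above is cut off before its displayed equation, and the missing display is not reconstructed here. As a rough illustration only, the classical Mercer theorem expands a continuous, symmetric, positive semidefinite kernel as $\sum_i \lambda_i \phi_i(x)\phi_i(y)$. The sketch below checks a finite-sample analogue for a Markov matrix built from a Gaussian kernel. The kernel form, the normalization, and all variable names are assumptions made for this example, not the paper's definitions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
eps = 0.5

# Gaussian kernel with alpha = 1/2 density normalization (assumed conventions).
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq / (2.0 * eps))
d = K.sum(axis=1)
M = K / np.sqrt(np.outer(d, d))      # symmetric, positive semidefinite
D = M.sum(axis=1)
P = M / D[:, None]                   # row-stochastic Markov matrix

# S = D^{-1/2} M D^{-1/2} is symmetric and shares its eigenvalues with P.
S = M / np.sqrt(np.outer(D, D))
lam, V = np.linalg.eigh(S)           # eigenvalues in ascending order
Psi = V / np.sqrt(D)[:, None]        # eigenvectors of P, orthonormal under D weights

# Finite-sample expansion: P = sum_i lam_i * psi_i psi_i^T D.
P_rec = ((Psi * lam) @ Psi.T) * D[None, :]
```

In this discrete setting the eigenvalues are non-negative up to round-off, the largest equals one, and `P_rec` reproduces `P`, which mirrors the non-negative spectrum and orthonormal eigenbasis named in the theorem.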

Figures (9)

  • Figure 1: Mickey Mouse: 2700 generated particles using DMPS and SVGD, with 2000 training samples
  • Figure 2: Mickey Mouse: one run of the SVGD generative model shows a strange non-uniform pattern, with 1000 training samples and 2700 generated particles
  • Figure 3: Mickey Mouse: error comparison between DMPS and SVGD
  • Figure 4: Two moons: 900 generated particles from DMPS, SVGD, and ULA with 500 training samples
  • Figure 5: Two moons: error comparison between DMPS, SVGD, and ULA. Solid lines use 500 training samples, dashed lines use 1000.
  • ...and 4 more figures

Theorems & Definitions (8)

  • Theorem 3.1: Mercer
  • Corollary 3.2
  • Theorem 5.1
  • proof
  • Theorem C.1
  • proof
  • Theorem D.1
  • proof