Manifold-Aligned Generative Transport

Xinyu Tian; Xiaotong Shen

TL;DR

We propose MAGT (Manifold-Aligned Generative Transport), a flow-like generator that learns a one-shot, manifold-aligned transport from a low-dimensional base distribution to the data space, and we establish finite-sample Wasserstein bounds linking smoothing level and score-approximation accuracy to generative fidelity.

Abstract

High-dimensional generative modeling is fundamentally a manifold-learning problem: real data concentrate near a low-dimensional structure embedded in the ambient space. Effective generators must therefore balance support fidelity -- placing probability mass near the data manifold -- with sampling efficiency. Diffusion models often capture near-manifold structure but require many iterative denoising steps and can leak off-support; normalizing flows sample in one pass but are limited by invertibility and dimension preservation. We propose MAGT (Manifold-Aligned Generative Transport), a flow-like generator that learns a one-shot, manifold-aligned transport from a low-dimensional base distribution to the data space. Training is performed at a fixed Gaussian smoothing level, where the score is well-defined and numerically stable. We approximate this fixed-level score using a finite set of latent anchor points with self-normalized importance sampling, yielding a tractable objective. MAGT samples in a single forward pass, concentrates probability near the learned support, and induces an intrinsic density with respect to the manifold volume measure, enabling principled likelihood evaluation for generated samples. We establish finite-sample Wasserstein bounds linking smoothing level and score-approximation accuracy to generative fidelity, and empirically improve fidelity and manifold concentration across synthetic and benchmark datasets while sampling substantially faster than diffusion models.
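
The anchor-based score approximation mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `snis_score`, the variance-preserving smoothing form $x_t=\sqrt{1-t}\,y+\sqrt{t}\,\varepsilon$, and the optional `log_ratio` argument (log of base density over proposal density at each anchor) are all assumptions made for illustration. Under those assumptions, the smoothed score at $x$ is a weighted average of $(\sqrt{1-t}\,y_k-x)/t$ over the anchors $y_k=g(z_k)$, with self-normalized weights.

```python
import numpy as np

def snis_score(x, anchors, t, log_ratio=None):
    """Self-normalized importance-sampling estimate of the smoothed score at x.

    Assumption (not from the paper): VP smoothing
        x_t = sqrt(1 - t) * y + sqrt(t) * eps,  eps ~ N(0, I),
    where y = g(z) lies on the generated support.

    x         : (D,) query point in the ambient space
    anchors   : (K, D) generator outputs g(z_k) at K latent anchor points
    t         : smoothing level in (0, 1)
    log_ratio : (K,) hypothetical log importance ratios log pi(z_k)/pi_tilde(z_k);
                zeros when the anchors are drawn from the base distribution itself
    """
    x = np.asarray(x, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    if log_ratio is None:
        log_ratio = np.zeros(anchors.shape[0])

    # Displacement from x to each scaled anchor sqrt(1 - t) * g(z_k).
    diff = np.sqrt(1.0 - t) * anchors - x              # (K, D)

    # Unnormalized log-weights: Gaussian kernel times importance ratio.
    log_w = -0.5 * np.sum(diff ** 2, axis=1) / t + np.asarray(log_ratio, dtype=float)

    # Normalize in a numerically stable way (subtract the max before exp).
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Weighted average of the per-anchor Gaussian scores (mean - x) / t.
    return (w @ diff) / t
```

With a single anchor the estimate reduces to the score of one Gaussian centered at the scaled anchor, and with two anchors placed symmetrically about the query point the contributions cancel. In practice the anchor count $K$ trades estimator variance against cost per evaluation.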

Paper Structure (65 sections, 25 theorems, 290 equations, 5 figures, 5 tables, 1 algorithm)

Introduction
MAGT: Manifold-aligned generative transport
Dimension alignment via perturbation
Ambient Gaussian perturbations.
Score matching and generator
Choice of proposal distribution $\tilde{\pi}$.
Training loss.
Intrinsic density and likelihood evaluation.
Comparisons with diffusion and flow models
Sampling cost.
Support alignment.
Likelihoods on the support and at fixed smoothing.
Computational footprint and architectural freedom.
Statistical guarantees.
Theory: excess risk and generation fidelity
...and 50 more sections

Key Result

Theorem 1

Under Assumptions G1--G3, suppose the VP noise level $t\in(0,1)$ lies in the tube regime $t\le t_{\max}:=c_{\mathrm{tube}}^2\,\rho_{\mathcal{M}}^2$ and $\theta_t:=\frac{C_N^{(\gamma)}\,t^\gamma}{(1-t)^\gamma}<1$. Then the one-shot generation error is controlled by the single-level mismatch, via a pull-back constant. The constants $C_T^{(\gamma)}$, $C_S^{(\gamma)}$, and $C_N^{(\gamma)}$ ...
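
As a worked reading of the contraction condition, and assuming $C_N^{(\gamma)}>0$ and $\gamma>0$ (the excerpt above does not state these signs explicitly), the requirement $\theta_t<1$ can be solved for $t$:

```latex
\theta_t = C_N^{(\gamma)}\Bigl(\frac{t}{1-t}\Bigr)^{\gamma} < 1
\;\Longleftrightarrow\;
\frac{t}{1-t} < \bigl(C_N^{(\gamma)}\bigr)^{-1/\gamma}
\;\Longleftrightarrow\;
t < \frac{1}{1+\bigl(C_N^{(\gamma)}\bigr)^{1/\gamma}} .
```

Under the same assumptions, the two hypotheses of the theorem together restrict the smoothing level to $t\le c_{\mathrm{tube}}^2\,\rho_{\mathcal{M}}^2$ and $t<1/\bigl(1+(C_N^{(\gamma)})^{1/\gamma}\bigr)$, so a larger $C_N^{(\gamma)}$ shrinks the admissible range of $t$.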

Figures (5)

Figure 1: Qualitative comparison of generative models on six synthetic manifolds. Each row corresponds to one toy dataset (rings2d, spiral2d, moons2d, checker2d, helix3d, torus3d). Columns show, from left to right, ground-truth samples, MAGT one-shot transport samples, diffusion-model samples generated with DDIM, and flow-matching samples.
Figure 2: Effect of sample size $n$, anchor count $K$, and smoothing level $t$ on MAGT and MAGT--DDIM across six synthetic benchmarks. Curves report Wasserstein distance ($W_2$; lower is better). Consistent with the bias--variance trade-off in Section \ref{sec:risk}, increasing $n$ and $K$ improves fidelity, while intermediate noise levels provide the most stable performance.
Figure 3: Unconditional generation on MNIST, comparing samples from MAGT, DDIM, and flow matching (FM), alongside held-out real test images (left to right).
Figure 4: Unconditional generation results on CIFAR10-0 (airplanes), comparing MAGT (left) and flow matching (FM) (right).
Figure 5: Class-wise PCA projections for five classes, comparing real genomic data with samples generated by MAGT and diffusion-based models.

Theorems & Definitions (51)

Definition 1: Hölder class
Definition 2: Reach and tubular neighborhood
Remark 1
Definition 3: Log--Sobolev constant
Theorem 1: Single-level pull-back bound
Corollary 1: Training-to-$W_2$ pipeline
Theorem 2: Score-matching excess-risk bound
Theorem 3: MAGT's generation fidelity
Corollary 2: Explicit $n$--rate
Lemma 1: $K$-approximation error
...and 41 more
