PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

Yilun Xu, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, Tommi Jaakkola

TL;DR

PFGM++ is a unifying, physics-inspired framework that augments $N$-dimensional data with $D$ additional variables and tracks sampling trajectories using only the scalar norm (radius) $r$ of those variables, bridging Poisson Flow Generative Models (PFGM, $D=1$) and diffusion models ($D\to\infty$). A perturbation-based objective gives unbiased training without the large batches that PFGM requires, and in the $D\to\infty$ limit both training and sampling reduce to those of standard diffusion models. Empirically, finite $D$ can outperform previous state-of-the-art diffusion models, with FID scores of $1.91$ on CIFAR-10 ($D=2048$) and $2.43$ on FFHQ $64\times64$ ($D=128$); in the class-conditional setting, $D=2048$ reaches an FID of $1.74$ on CIFAR-10, which the authors report as the current state of the art. The alignment $r=\sigma\sqrt{D}$ transfers well-tuned diffusion-model hyperparameters to any finite $D$ without retuning. Smaller $D$ gives more robustness to modeling errors, while larger $D$ couples the data and the norm of the additional variables more tightly, so the augmentation dimension $D$ becomes a new design choice for balancing robustness against rigidity.
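The alignment $r=\sigma\sqrt{D}$ and its effect on the perturbation size can be explored numerically. The following is a minimal sketch, not the authors' implementation: it assumes the radial perturbation density $p_r(R) \propto R^{N-1}/(R^2+r^2)^{(N+D)/2}$ (taken from the full paper, not stated in this summary), and the function names `aligned_radius` and `sample_perturbation_norm` are hypothetical. Under that assumption, $R^2/r^2$ follows a beta-prime law with parameters $(N/2, D/2)$, so it can be drawn from a Beta sample.

```python
import numpy as np


def aligned_radius(sigma, D):
    """Alignment from the text: r = sigma * sqrt(D)."""
    return sigma * np.sqrt(D)


def sample_perturbation_norm(sigma, N, D, size, rng):
    """Draw R = ||x - y|| under the ASSUMED radial density
    p_r(R) ∝ R^(N-1) / (R^2 + r^2)^((N+D)/2), with r = sigma * sqrt(D).

    With t = R^2 / r^2, t is beta-prime(N/2, D/2), i.e. t = B / (1 - B)
    for B ~ Beta(N/2, D/2).
    """
    r = aligned_radius(sigma, D)
    b = rng.beta(N / 2.0, D / 2.0, size=size)
    return r * np.sqrt(b / (1.0 - b))


rng = np.random.default_rng(0)
N = 3 * 32 * 32   # dimensionality of a CIFAR-10 image
sigma = 1.0       # illustrative noise level (assumption)

# Standard deviation of the perturbation norm for several D.
spread = {}
for D in (128, 2048, 2**20):
    R = sample_perturbation_norm(sigma, N, D, size=20000, rng=rng)
    spread[D] = R.std()
    print(f"D={D:>8d}  mean R / (sigma*sqrt(N)) = {R.mean() / (sigma * np.sqrt(N)):.3f}"
          f"  std R = {spread[D]:.3f}")
```

Under these assumptions the spread of $R$ shrinks as $D$ grows and, for very large $D$, approaches roughly $\sigma/\sqrt{2}$, the spread of the norm of $N$-dimensional Gaussian noise with standard deviation $\sigma$. This is one way to see the more concentrated coupling at large $D$ that the summary describes.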

Abstract

We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for $N$-dimensional data by embedding paths in $N{+}D$-dimensional space while still controlling the progression with a simple scalar norm of the $D$ additional variables. The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$. The flexibility of choosing $D$ allows us to trade off robustness against rigidity, as increasing $D$ results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large-batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to diffusion models. To explore different choices of $D$, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models ($D{\to} \infty$) to any finite $D$ values. Our experiments show that models with finite $D$ can be superior to previous state-of-the-art diffusion models on CIFAR-10/FFHQ $64{\times}64$ datasets, with FID scores of $1.91/2.43$ when $D{=}2048/128$. In the class-conditional setting, $D{=}2048$ yields a current state-of-the-art FID of $1.74$ on CIFAR-10. In addition, we demonstrate that models with smaller $D$ exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp
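A one-line limit illustrates why $D\to\infty$ recovers diffusion models. It is a sketch under two assumptions: the perturbation kernel has the heavy-tailed form $p_r(\mathbf{x}\mid\mathbf{y}) \propto (\|\mathbf{x}-\mathbf{y}\|_2^2 + r^2)^{-(N+D)/2}$ (taken from the full paper, not stated in this summary), and the radius is aligned as $r=\sigma\sqrt{D}$. Factoring out the constant $(\sigma^2 D)^{-(N+D)/2}$ gives

$$
p_{r=\sigma\sqrt{D}}(\mathbf{x}\mid\mathbf{y}) \;\propto\; \left(1 + \frac{\|\mathbf{x}-\mathbf{y}\|_2^2}{\sigma^2 D}\right)^{-\frac{N+D}{2}} \;\xrightarrow{\;D\to\infty\;}\; \exp\!\left(-\frac{\|\mathbf{x}-\mathbf{y}\|_2^2}{2\sigma^2}\right),
$$

which is, up to normalization, the Gaussian kernel $\mathcal{N}(\mathbf{y}, \sigma^2\mathbf{I})$ used to perturb data in diffusion models. For finite $D$ the kernel keeps polynomial rather than exponential tails.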

Paper Structure

This paper contains 38 sections, 8 theorems, 41 equations, 9 figures, 4 tables, 8 algorithms.

Key Result

Theorem 3.1

Assume the data distribution $p\in {\mathcal{C}}^1$ and $p$ has compact support. As $r_{\textrm{max}}{\to} \infty$, for $D \in \mathbb{R}^+$, the ODE ${\mathrm{d}{\mathbf{x}}}/{\mathrm{d}r} = \mathbf{E}(\tilde{\mathbf{x}})_{\mathbf{x}} /E(\tilde{\mathbf{x}})_{r}$ defines a bijection between …

Figures (9)

  • Figure 1: Overview of paper contributions and structure. PFGM++ unifies PFGM and diffusion models and offers the potential to combine their strengths (robustness and rigidity).
  • Figure 2: The augmented dimension $D$ affects the electric field lines (gray), which connect charge/data on a line (purple) to latent space (green). When $D=1$ (top) or $D=2$ (bottom), the electric field lines map the same red line segment to a blue line segment or onto a blue ring, respectively. The mapping defined by the electric field lines has $SO(2)$ symmetry on the surface of the cylinder $z_1^2+z_2^2=r^2$.
  • Figure 3: Mean TVD between the posterior $p_{0|r}(\cdot|{\mathbf{x}})$ (where ${\mathbf{x}}$ is a perturbed sample) and the uniform prior, without (a) and with (b) the phase alignment ($r=\sigma\sqrt{D}$).
  • Figure 4: (a) Average $\ell_2$ difference between scaled electric field and score function, versus $D$. (b) Log-variance of radius distribution versus $D$. (c) Density of radius distributions $p_{r=\sigma\sqrt{D}}(R)$ with varying $\sigma$ and $D$.
  • Figure 5: FID score versus (left) $\alpha$ and (right) NFE on CIFAR-10.
  • ...and 4 more figures

Theorems & Definitions (12)

  • Theorem 3.1
  • Proposition 3.1
  • Theorem 4.1
  • Proposition 4.1
  • Theorem 1.1
  • proof
  • Proposition 1.0
  • proof
  • Theorem 1.1
  • proof
  • ...and 2 more