PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

Yilun Xu; Ziming Liu; Yonglong Tian; Shangyuan Tong; Max Tegmark; Tommi Jaakkola

TL;DR

PFGM++ is a physics-inspired framework that unifies Poisson Flow Generative Models (PFGM) and diffusion models. It augments $N$-dimensional data with $D$ additional variables and tracks sampling trajectories using only the scalar norm (radius) of those variables. A perturbation-based objective gives unbiased training without the large batches that PFGM required. The $D\to\infty$ limit recovers standard diffusion-model training and sampling, while the choice of $D$ sets a robustness–rigidity trade-off. Empirically, finite $D$ can outperform state-of-the-art diffusion models ($D=2048$ on CIFAR-10 and $D=128$ on FFHQ $64\times64$), with a current state-of-the-art class-conditional CIFAR-10 FID of $1.74$ at $D=2048$. The alignment $r=\sigma\sqrt{D}$ transfers well-tuned diffusion-model hyperparameters to finite $D$ without retuning. The work thus adds a new axis for tuning generative models: choosing the augmentation dimension $D$ to balance robustness against rigidity.

Abstract

We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for $N$-dimensional data by embedding paths in $N{+}D$-dimensional space while still controlling the progression with a simple scalar norm of the $D$ additional variables. The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$. The flexibility of choosing $D$ allows us to trade off robustness against rigidity, as increasing $D$ results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large-batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to diffusion models. To explore different choices of $D$, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models ($D{\to} \infty$) to any finite $D$ value. Our experiments show that models with finite $D$ can be superior to previous state-of-the-art diffusion models on the CIFAR-10/FFHQ $64{\times}64$ datasets, with FID scores of $1.91/2.43$ when $D{=}2048/128$. In the class-conditional setting, $D{=}2048$ yields a current state-of-the-art FID of $1.74$ on CIFAR-10. In addition, we demonstrate that models with smaller $D$ exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp
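The alignment $r=\sigma\sqrt{D}$ and the perturbation-based objective can be illustrated with a small NumPy sketch. It assumes the perturbation kernel has the heavy-tailed form $p_r(\mathbf{x}|\mathbf{y}) \propto (\|\mathbf{x}-\mathbf{y}\|^2 + r^2)^{-(N+D)/2}$, which this summary does not state explicitly. Under that assumption, the squared distance $\|\mathbf{x}-\mathbf{y}\|^2/r^2$ follows a beta-prime distribution with parameters $(N/2, D/2)$, which can be drawn as a ratio of two gamma variates. The function names `sigma_to_r` and `sample_perturbed` are hypothetical and are not taken from the authors' repository.

```python
import numpy as np

def sigma_to_r(sigma, D):
    """Map a diffusion noise level sigma to a PFGM++ radius via r = sigma * sqrt(D)."""
    return sigma * np.sqrt(D)

def sample_perturbed(y, sigma, D, rng):
    """Perturb clean data y of shape (B, N) under the assumed finite-D kernel.

    Assumption: p_r(x|y) is proportional to (||x - y||^2 + r^2)^(-(N + D) / 2),
    so ||x - y||^2 / r^2 is beta-prime(N/2, D/2) distributed.
    """
    B, N = y.shape
    r = sigma_to_r(sigma, D)
    # A beta-prime(a, b) variate is the ratio of independent Gamma(a) and Gamma(b) variates.
    t = rng.gamma(N / 2.0, size=B) / rng.gamma(D / 2.0, size=B)
    R = r * np.sqrt(t)  # perturbation radius ||x - y||
    # Uniformly random direction on the unit sphere in N dimensions.
    u = rng.standard_normal((B, N))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    return y + R[:, None] * u, R

rng = np.random.default_rng(0)
y = np.zeros((4000, 64))
x_small_D, R_small_D = sample_perturbed(y, sigma=2.0, D=128, rng=rng)
x_large_D, R_large_D = sample_perturbed(y, sigma=2.0, D=10**6, rng=rng)
```

For very large $D$ the radii concentrate near $\sigma\sqrt{N}$, the typical norm of Gaussian noise with standard deviation $\sigma$. Smaller $D$ gives a broader, heavier-tailed spread of radii at the same $\sigma$.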

Paper Structure (38 sections, 8 theorems, 41 equations, 9 figures, 4 tables, 8 algorithms)

Introduction
Background and Related Works
PFGM++: A Novel Generative Framework
Electric field in $\bm{N{+}D}$-dimensional space
New objective with Perturbation Kernel
Diffusion Models as $\bm{D{\to} \infty}$ Special Cases
Transfer hyperparameters to finite $\bm{D}$s
Balancing Robustness and Rigidity
Behavior of perturbation kernel when varying $\bm{D}$
Balancing the trade-off by controlling $\bm{D}$
Experiments
Image generation
Model robustness versus $\bm{D}$
Conclusion and Future Directions
Proofs
...and 23 more sections

Key Result

Theorem 3.1

Assume the data distribution $p\in {\mathcal{C}}^1$ and $p$ has compact support. As $r_{\textrm{max}}{\to} \infty$, for $D \in \mathbb{R}^+$, the ODE ${\mathrm{d}{\mathbf{x}}}/{\mathrm{d}r} = \mathbf{E}(\tilde{\mathbf{x}})_{\mathbf{x}} /E(\tilde{\mathbf{x}})_{r}$ defines a bijection between $\lim_{

Figures (9)

Figure 1: Overview of paper contributions and structure. PFGM++ unifies PFGM and diffusion models and offers the potential to combine their strengths (robustness and rigidity).
Figure 2: The augmented dimension $D$ affects electric field lines (gray), which connect charge/data on a line (purple) to latent space (green). When $D=1$ (top) or $D=2$ (bottom), electric field lines map the same red line segment to a blue line segment or onto a blue ring, respectively. The mapping defined by electric lines has $SO(2)$ symmetry on the surface of the cylinder $z_1^2+z_2^2=r^2$.
Figure 3: Mean total variation distance (TVD) between the posterior $p_{0|r}(\cdot|{\mathbf{x}})$ (where ${\mathbf{x}}$ is a perturbed sample) and the uniform prior, (a) without and (b) with the phase alignment ($r=\sigma\sqrt{D}$).
Figure 4: (a) Average $\ell_2$ difference between scaled electric field and score function, versus $D$. (b) Log-variance of radius distribution versus $D$. (c) Density of radius distributions $p_{r=\sigma\sqrt{D}}(R)$ with varying $\sigma$ and $D$.
Figure 5: FID score versus $\alpha$ (left) and NFE (right) on CIFAR-10.
...and 4 more figures
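The $D\to\infty$ behavior examined in Figures 3 and 4(a) can be checked numerically on a toy dataset. The sketch below rests on assumptions not stated in this summary: a uniform prior over a finite set of data points, and a perturbation kernel proportional to $(\|\mathbf{x}-\mathbf{y}\|^2 + r^2)^{-(N+D)/2}$. With $r=\sigma\sqrt{D}$, the posterior weights over data points should then approach the Gaussian softmax weights $\exp(-\|\mathbf{x}-\mathbf{y}\|^2/(2\sigma^2))$ as $D$ grows. The function names are hypothetical.

```python
import numpy as np

def posterior_weights_finite_d(x, ys, sigma, D):
    """Posterior over data points ys (M, N) given a perturbed x (N,), for finite D.

    Assumption: kernel proportional to (||x - y||^2 + r^2)^(-(N + D) / 2)
    with r = sigma * sqrt(D), and a uniform prior over the rows of ys.
    """
    n = x.shape[0]
    r2 = sigma ** 2 * D
    sq = np.sum((ys - x) ** 2, axis=1)
    # log kernel up to a constant: -(N + D)/2 * log(1 + ||x - y||^2 / r^2)
    log_w = -0.5 * (n + D) * np.log1p(sq / r2)
    log_w -= log_w.max()  # stabilize the exponentials
    w = np.exp(log_w)
    return w / w.sum()

def posterior_weights_gaussian(x, ys, sigma):
    """Posterior under the Gaussian kernel of a diffusion model (the D -> infinity limit)."""
    sq = np.sum((ys - x) ** 2, axis=1)
    log_w = -sq / (2.0 * sigma ** 2)
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w / w.sum()

rng = np.random.default_rng(0)
ys = rng.standard_normal((16, 4))      # toy dataset: 16 points in N = 4 dimensions
x = ys[0] + rng.standard_normal(4)     # a perturbed sample
w_inf = posterior_weights_gaussian(x, ys, sigma=1.0)
# L1 gap between finite-D and Gaussian posterior weights for several D.
gaps = {
    D: float(np.abs(posterior_weights_finite_d(x, ys, 1.0, D) - w_inf).sum())
    for D in (8, 128, 2048, 10**6)
}
```

Printing `gaps` shows how quickly the finite-$D$ posterior approaches the Gaussian one on this toy problem. It is only a qualitative analogue of the comparison in Figure 4(a), not a reproduction of it.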

Theorems & Definitions (12)

Theorem 3.1
Proposition 3.1
Theorem 4.1
Proposition 4.1
Theorem 1.1
proof
Proposition 1.0
proof
Theorem 1.1
proof
...and 2 more
