PFGM++: Unlocking the Potential of Physics-Inspired Generative Models
Yilun Xu, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, Tommi Jaakkola
TL;DR
PFGM++ is a unifying, physics-inspired framework that augments $N$-dimensional data with $D$ additional variables and tracks sampling trajectories through the scalar norm of those variables, bridging Poisson Flow Generative Models (PFGM, $D=1$) and diffusion models ($D\to\infty$). A perturbation-based objective gives unbiased training without the large-batch field targets used in PFGM, and the formal $D\to\infty$ limit recovers standard diffusion-model training and sampling. The choice of $D$ trades robustness against rigidity. Empirically, finite $D$ can outperform previous state-of-the-art diffusion models on CIFAR-10 and FFHQ $64\times 64$ (FID $1.91$ at $D=2048$ and $2.43$ at $D=128$, respectively), and $D=2048$ reaches a class-conditional CIFAR-10 FID of $1.74$. The alignment $r=\sigma\sqrt{D}$ transfers well-tuned hyperparameters from diffusion models directly to any finite $D$, and models with smaller $D$ are more robust to modeling errors. The work thus adds a new axis for tuning generative models: the augmentation dimension $D$.
Abstract
We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for $N$-dimensional data by embedding paths in $N{+}D$-dimensional space while still controlling the progression with a simple scalar norm of the $D$ additional variables. The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$. The flexibility of choosing $D$ allows us to trade off robustness against rigidity, as increasing $D$ results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large-batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to that of diffusion models. To explore different choices of $D$, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models ($D{\to} \infty$) to any finite $D$. Our experiments show that models with finite $D$ can be superior to previous state-of-the-art diffusion models on the CIFAR-10/FFHQ $64{\times}64$ datasets, with FID scores of $1.91/2.43$ when $D{=}2048/128$. In the class-conditional setting, $D{=}2048$ yields a current state-of-the-art FID of $1.74$ on CIFAR-10. In addition, we demonstrate that models with smaller $D$ exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp
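To make the alignment $r=\sigma\sqrt{D}$ and the "more concentrated coupling" at large $D$ concrete, the following is a minimal Python sketch, not the authors' implementation. The function names are hypothetical. The sampler assumes a perturbation-norm density of the form $p_r(R)\propto R^{N-1}/(R^2+r^2)^{(N+D)/2}$, which is not stated in the text above; under that assumption $R^2/r^2$ follows a beta-prime distribution and can be drawn as a ratio of two Gamma variates.

```python
import math
import random


def sigma_to_r(sigma: float, D: int) -> float:
    """Alignment from the text: map a diffusion noise level sigma to r = sigma * sqrt(D)."""
    return sigma * math.sqrt(D)


def sample_perturbation_norm(sigma: float, N: int, D: int, rng: random.Random) -> float:
    """Draw the norm R of an N-dimensional perturbation at radius r = sigma * sqrt(D).

    ASSUMPTION (not from the text above): p_r(R) is proportional to
    R^(N-1) / (R^2 + r^2)^((N+D)/2), so R^2 / r^2 is beta-prime(N/2, D/2),
    i.e. a ratio of Gamma(N/2) and Gamma(D/2) variates with equal scale.
    """
    r = sigma_to_r(sigma, D)
    g_n = rng.gammavariate(N / 2, 1.0)  # shape N/2, scale 1
    g_d = rng.gammavariate(D / 2, 1.0)  # shape D/2, scale 1
    return r * math.sqrt(g_n / g_d)


def mean_squared_norm(sigma: float, N: int, D: int, n_samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of E[R^2] for the assumed kernel."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sample_perturbation_norm(sigma, N, D, rng) ** 2
    return total / n_samples
```

Under the assumed kernel, $\mathbb{E}[R^2]=\sigma^2 N D/(D-2)$ for $D>2$, which approaches the Gaussian value $\sigma^2 N$ as $D$ grows; small $D$ gives a heavier-tailed norm, which is one way to read the robustness-versus-rigidity trade-off described in the abstract.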
