A solvable model of learning generative diffusion: theory and insights
Hugo Cui, Cengiz Pehlevan, Yue M. Lu
TL;DR
The paper develops a solvable model of learning generative diffusion: a two-layer denoising autoencoder (DAE), trained with online SGD, on a high-dimensional target density with an underlying low-dimensional manifold structure. It derives a tight asymptotic description in two tiers. First, deterministic ODEs for low-dimensional summary statistics of the weights capture the learning dynamics. Second, a reduced, low-dimensional transport SDE describes how generated samples evolve, with a corollary for fixed projections. Together these yield sharp, interpretable characterizations of low-dimensional projections of the generated density and of how that density changes over the course of training, for targets including Gaussian mixtures and MNIST. The study shows that architectural biases in the DAE can cause mode collapse and, when the model is re-trained on its own synthetic data, model collapse, which underscores the role of network design in diffusion-based generative modeling.
Abstract
In this manuscript, we consider the problem of learning a flow or diffusion-based generative model parametrized by a two-layer auto-encoder, trained with online stochastic gradient descent, on a high-dimensional target density with an underlying low-dimensional manifold structure. We derive a tight asymptotic characterization of low-dimensional projections of the distribution of samples generated by the learned model, ascertaining in particular its dependence on the number of training samples. Building on this analysis, we discuss how mode collapse can arise and, in turn, lead to model collapse when the generative model is re-trained on generated synthetic data.
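To make the setting concrete, the following is a minimal sketch of online SGD for a small denoising autoencoder on a two-cluster Gaussian mixture. It is an illustration under stated assumptions, not the parametrization analyzed in the paper: the tied-weight form `f(x) = c*x + w*tanh(w.x/d)`, the single fixed noise level `t`, the cluster mean `mu`, the learning rates, and the function name `train_dae_online` are all hypothetical choices made for this example. Each step draws a fresh sample, so the number of steps equals the number of training samples, and the returned overlap between `w` and `mu` is one example of a low-dimensional weight-summary statistic.

```python
import numpy as np

def train_dae_online(d=50, n_steps=20000, lr=0.05, t=0.5, sigma=0.3, seed=0):
    """Online SGD on a toy DAE f(x) = c*x + w*tanh(w.x/d) (assumed form)."""
    rng = np.random.default_rng(seed)
    mu = np.ones(d)                      # assumed cluster mean; clusters at +mu and -mu
    w = 0.1 * rng.standard_normal(d)     # tied encoder/decoder weight vector
    c = 0.0                              # scalar skip-connection strength
    for _ in range(n_steps):
        # Fresh target sample at every step (online setting).
        s = rng.choice([-1.0, 1.0])
        x = s * mu + sigma * rng.standard_normal(d)
        # Noisy input: linear interpolation between Gaussian noise and the sample.
        xt = (1.0 - t) * rng.standard_normal(d) + t * x
        a = np.tanh(w @ xt / d)          # hidden activation (width one)
        r = c * xt + w * a - x           # residual of the denoising loss 0.5*||r||^2
        # Exact gradient of 0.5*||r||^2 with respect to w.
        grad_w = r * a + (r @ w) * (1.0 - a**2) * xt / d
        # Gradient with respect to c, rescaled by 1/d since it sums over d coordinates.
        grad_c = (r @ xt) / d
        w -= lr * grad_w
        c -= lr * grad_c
    # Summary statistic: cosine similarity between the learned weight and mu.
    overlap = (w @ mu) / (np.linalg.norm(w) * np.linalg.norm(mu))
    return w, c, overlap

if __name__ == "__main__":
    w, c, overlap = train_dae_online()
    print(f"skip strength c = {c:.3f}, |overlap with mu| = {abs(overlap):.3f}")
```

The sign of the overlap is arbitrary because this toy network is unchanged under `w -> -w`, so only its absolute value is meaningful.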
