Theory of Speciation Transitions in Diffusion Models with General Class Structure
Beatrice Achilli, Marco Benedetti, Giulio Biroli, Marc Mézard
TL;DR
The paper develops a general theory of speciation transitions in diffusion-based generative models for targets with arbitrary class structure by introducing Bayes attribution and a free-entropy criterion. A rigorous speciation time $t_{rs}$ is defined via the balance between the mean free-entropy difference and its fluctuations, yielding the large-$N$ scalings $t_{rs}\sim \tfrac{1}{2}\log N$ when first moments separate the classes and $t_{rs}\sim \tfrac{1}{4}\log N$ when they do not. The authors apply the framework to two Gaussian-mixture scenarios and to multi-class 1D Ising mixtures, deriving analytical expressions (via replica methods for Ising) and validating the predictions with numerical U-turn experiments, which reveal hierarchical speciation times. The results provide a unified description of speciation transitions in diffusion models, enabling principled prediction of when and how trajectories commit to data classes in high-dimensional settings with diverse class definitions. The approach applies to diffusion-based generation beyond Gaussian mixtures, including models whose classes differ through complex, higher-order, or collective features.
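The two scalings quoted above can be checked on a toy case. The following is a minimal sketch, not the paper's definition of $t_{rs}$: it assumes an Ornstein-Uhlenbeck forward process $dx = -x\,dt + \sqrt{2}\,dW$, two equally weighted isotropic Gaussian classes in dimension $N$, and it takes the crossing time to be the time at which the mean of the log-likelihood ratio between the two classes equals one standard deviation of that ratio. The function names, the parameters `mu2`, `sigma2`, `s1`, `s2`, and the "mean equals one standard deviation" threshold are all illustrative assumptions.

```python
import math


def stats_separated_means(t, N, mu2=1.0, sigma2=1.0):
    """Mean and std of the class log-likelihood ratio at time t for two
    Gaussians centred at +m and -m with |m|^2 = N * mu2 and per-coordinate
    variance sigma2 (hypothetical parameters), evaluated under class +."""
    delta = 1.0 + (sigma2 - 1.0) * math.exp(-2.0 * t)  # noised variance
    a = N * mu2 * math.exp(-2.0 * t) / delta
    return 2.0 * a, 2.0 * math.sqrt(a)


def stats_zero_mean(t, N, s1=2.0, s2=0.5):
    """Same statistics for two zero-mean isotropic Gaussians with
    per-coordinate variances s1 > s2 (hypothetical values), under class 1."""
    u = math.exp(-2.0 * t)
    d2 = 1.0 + (s2 - 1.0) * u          # noised variance of class 2
    eps = (s1 - s2) * u / d2           # ratio of noised variances minus 1
    mean = 0.5 * N * (eps - math.log1p(eps))   # KL divergence, class 1 vs 2
    std = math.sqrt(0.5 * N) * abs(eps)
    return mean, std


def crossing_time(stats, N, t_lo=0.0, t_hi=20.0, iters=100):
    """Bisection for the time at which mean == std. Assumes mean > std at
    t_lo and mean < std at t_hi, which holds for the N values used here."""
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        mean, std = stats(t_mid, N)
        if mean > std:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)


def fitted_slope(stats, N_small=1e4, N_large=1e8):
    """Finite-difference estimate of d(crossing time) / d(log N)."""
    dt = crossing_time(stats, N_large) - crossing_time(stats, N_small)
    return dt / (math.log(N_large) - math.log(N_small))


if __name__ == "__main__":
    print("separated means, slope vs log N:", fitted_slope(stats_separated_means))
    print("zero-mean classes, slope vs log N:", fitted_slope(stats_zero_mean))
```

In this toy setting the first slope comes out at $1/2$ and the second close to $1/4$. The difference arises because, for zero-mean classes, the mean log-likelihood ratio is quadratic in the residual signal $e^{-2t}$ while its fluctuations are linear in it, so the balance is reached at $e^{-2t}\sim N^{-1/2}$ rather than $N^{-1}$.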
Abstract
Diffusion Models generate data by reversing a stochastic diffusion process, progressively transforming noise into structured samples drawn from a target distribution. Recent theoretical work has shown that this backward dynamics can undergo sharp qualitative transitions, known as speciation transitions, during which trajectories become dynamically committed to data classes. Existing theoretical analyses, however, are limited to settings where classes are identifiable through first moments, such as mixtures of Gaussians with well-separated means. In this work, we develop a general theory of speciation in diffusion models that applies to arbitrary target distributions admitting well-defined classes. We formalize the notion of class structure through Bayes classification and characterize speciation times in terms of the free-entropy difference between classes. This criterion recovers known results in previously studied Gaussian-mixture models, while extending to situations in which classes are not distinguishable by first moments and may instead differ through higher-order or collective features. Our framework also accommodates multiple classes and predicts the existence of successive speciation times associated with increasingly fine-grained class commitment. We illustrate the theory on two analytically tractable examples: mixtures of one-dimensional Ising models at different temperatures and mixtures of zero-mean Gaussians with distinct covariance structures. In the Ising case, we obtain explicit expressions for speciation times by mapping the problem onto a random-field Ising model and solving it via the replica method. Our results provide a unified and broadly applicable description of speciation transitions in diffusion-based generative models.
