Generative Modeling via Drifting
Mingyang Deng, He Li, Tianhong Li, Yilun Du, Kaiming He
TL;DR
Drifting Models evolve the pushforward distribution $q = (f_\theta)_{\#}\, p_{\boldsymbol{\epsilon}}$ during training via a drifting field $\mathbf{V}_{p,q}$ that vanishes at equilibrium, when $q = p_{\text{data}}$, which yields a generator that samples in a single forward pass. The drift combines kernelized attraction toward data with repulsion from generated samples, and training uses a fixed-point objective with stop-gradient targets to align $q$ with $p_{\text{data}}$. The method extends drifting to feature space, supports multi-scale representations, and can incorporate classifier-free guidance by drawing on both class-conditional and unconditional data. On ImageNet 256×256 it achieves state-of-the-art 1-NFE FID in both latent space ($\mathrm{FID}=1.54$) and pixel space ($\mathrm{FID}=1.61$), and it also performs well on robotics control, illustrating a practical, diffusion-free paradigm for efficient, high-quality generation.
Abstract
Generative modeling can be formulated as learning a mapping $f$ such that its pushforward distribution matches the data distribution. The pushforward behavior can be carried out iteratively at inference time, for example in diffusion and flow-based models. In this paper, we propose a new paradigm called Drifting Models, which evolve the pushforward distribution during training and naturally admit one-step inference. We introduce a drifting field that governs the sample movement and achieves equilibrium when the distributions match. This leads to a training objective that allows the neural network optimizer to evolve the distribution. In experiments, our one-step generator achieves state-of-the-art results on ImageNet at 256×256 resolution, with an FID of 1.54 in latent space and 1.61 in pixel space. We hope that our work opens up new opportunities for high-quality one-step generation.
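To make the idea of a drifting field concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a Gaussian kernel, a normalized mean-shift form for both the attraction and the repulsion term, and free-moving particles in place of a neural generator; the names `gaussian_kernel`, `mean_shift`, `drift`, and `evolve`, as well as the bandwidth and step size, are illustrative choices. Under these assumptions the drift is an attraction toward data samples minus the same quantity computed on generated samples, so it cancels when the two sample sets coincide. Moving each particle by a small multiple of the drift mirrors what a gradient step would do to a generator output $x$ under an assumed squared loss $\lVert x - \mathrm{sg}(x + \mathbf{V}) \rVert^2$, where $\mathrm{sg}$ denotes stop-gradient.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_shift(x, samples, bandwidth=1.0):
    """Kernel-weighted average displacement from x toward `samples`."""
    weights = [gaussian_kernel(x, y, bandwidth) for y in samples]
    total = sum(weights)
    return [
        sum(w * (y[d] - x[d]) for w, y in zip(weights, samples)) / total
        for d in range(len(x))
    ]


def drift(x, data_samples, gen_samples, bandwidth=1.0):
    """Assumed drift: pull toward data minus pull toward generated samples.

    Subtracting the pull toward generated samples acts as a repulsion,
    and the two terms cancel when both sample sets are identical.
    """
    attract = mean_shift(x, data_samples, bandwidth)
    repel = mean_shift(x, gen_samples, bandwidth)
    return [a - r for a, r in zip(attract, repel)]


def evolve(gen_samples, data_samples, steps=200, step_size=0.1, bandwidth=1.0):
    """Move all generated samples along the drift, updating them together."""
    gen = [list(x) for x in gen_samples]
    for _ in range(steps):
        # Compute every drift from the current samples before moving any.
        drifts = [drift(x, data_samples, gen, bandwidth) for x in gen]
        gen = [
            [xi + step_size * vi for xi, vi in zip(x, v)]
            for x, v in zip(gen, drifts)
        ]
    return gen
```

For example, with 1-D data `[[2.5], [3.0], [3.5]]` and initial generated samples `[[-0.5], [0.0], [0.5]]`, `drift([0.0], data, gen)` points in the positive direction, toward the data, while `drift([0.0], data, data)` is exactly zero. In an actual model the particles would be outputs of $f_\theta$ and the movement would come from the optimizer rather than from a direct update.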
