A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang, Mingjia Shi, Yukun Zhou, Zekai Li, Zhihang Yuan, Yuzhang Shang, Xiaojiang Peng, Hanwang Zhang, Yang You
TL;DR
SpeeD tackles the high cost of diffusion-model training by dividing the time steps into acceleration, deceleration, and convergence areas according to the process increment. It introduces asymmetric sampling, which draws convergence-area steps less often, and change-aware weighting, which emphasizes steps whose process increments change rapidly, yielding a consistent ~3× speed-up across architectures and datasets with minimal overhead. The approach is theoretically grounded: it provides boundary definitions and generalizes to s-sigma scheduled SDEs, and it proves robust across tasks, datasets, and competing acceleration methods. Practically, this work lowers the barrier to diffusion-model research by reducing training costs while maintaining or improving sample quality and remaining applicable to conditional generation tasks.
Abstract
Training diffusion models is a computation-intensive task. In this paper, we introduce a novel speed-up method for diffusion model training, called SpeeD, which is based on a closer look at time steps. Our key findings are: i) Time steps can be empirically divided into acceleration, deceleration, and convergence areas based on the process increment. ii) These time steps are imbalanced, with many concentrated in the convergence area. iii) The concentrated steps provide limited benefits for diffusion training. To address this, we design an asymmetric sampling strategy that reduces the sampling frequency of steps from the convergence area while increasing the sampling probability of steps from the other areas. Additionally, we propose a weighting strategy that emphasizes time steps with rapid-change process increments. As a plug-and-play and architecture-agnostic approach, SpeeD consistently achieves 3-times acceleration across various diffusion architectures, datasets, and tasks. Notably, due to its simple design, our approach significantly reduces the cost of diffusion model training with minimal overhead. Our research enables more researchers to train diffusion models at a lower cost.
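
The two strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the boundary `tau`, the ratio `k`, the toy `increments` curve, and the choice to place the convergence area at the later time steps are all assumptions made for illustration; the abstract does not specify them. The weighting rule used here (absolute finite-difference change of the increment, normalized to mean 1) is likewise a stand-in for the paper's change-aware weighting.

```python
import numpy as np


def asymmetric_probs(num_steps, tau, k):
    """Sampling probabilities over time steps 0..num_steps-1.

    Assumption: steps at index >= tau form the convergence area, and every
    step before tau is sampled k times as often as a convergence-area step.
    """
    mass = np.ones(num_steps, dtype=np.float64)
    mass[:tau] = k
    return mass / mass.sum()


def change_aware_weights(increments, floor=1e-8):
    """Per-step loss weights that grow with how fast the increment changes.

    Assumption: "change" is measured by the absolute finite-difference
    gradient of a per-step increment curve; weights are scaled to mean 1.
    """
    increments = np.asarray(increments, dtype=np.float64)
    change = np.abs(np.gradient(increments)) + floor
    return change / change.mean()


# Hypothetical settings for illustration only.
num_steps, tau, k = 1000, 700, 2.0
rng = np.random.default_rng(0)

probs = asymmetric_probs(num_steps, tau, k)
timesteps = rng.choice(num_steps, size=8, p=probs)  # batch of sampled steps

# Toy increment curve standing in for the real process increment.
increments = np.exp(-np.linspace(0.0, 5.0, num_steps))
weights = change_aware_weights(increments)
batch_weights = weights[timesteps]  # multiply the per-sample loss by these
```

In a training loop, `timesteps` would replace the usual uniform draw of time steps and `batch_weights` would scale the per-sample loss before averaging.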
