A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
Taehong Moon, Moonseok Choi, EungGu Yun, Jongmin Yoon, Gayoung Lee, Jaewoong Cho, Juho Lee
TL;DR
Adaptive Score Estimation (ASE) is a time-varying early-exiting framework for diffusion models that accelerates sampling by adaptively skipping portions of the score estimation network across time steps. Building on the observation that the difficulty of score estimation varies with diffusion time, ASE combines a time-dependent block-dropping schedule with a targeted fine-tuning strategy that uses EMA and time-weighted losses to maintain sample quality. Empirical results on DiT and U-ViT backbones show 25–30% faster sampling with comparable FID across multiple solvers. The approach also extends to large-scale text-to-image generation, where it requires only modest fine-tuning data and delivers a modest speedup. It is architecture-agnostic and solver-compatible, offering a practical path to deploying faster diffusion-based generators in real-world settings. Its main limitation is that the dropping schedules are designed manually.
Abstract
Diffusion models have shown remarkable performance in generation problems across various domains, including images, videos, text, and audio. A practical bottleneck of diffusion models is their sampling speed, due to the repeated evaluation of score estimation networks during inference. In this work, we propose a novel framework capable of adaptively allocating the compute required for score estimation, thereby reducing the overall sampling time of diffusion models. We observe that the amount of computation required for score estimation may vary with the time step at which the score is estimated. Based on this observation, we propose an early-exiting scheme in which we skip a subset of the parameters in the score estimation network during inference, according to a time-dependent exit schedule. Using diffusion models for image synthesis, we show that our method can significantly improve sampling throughput without compromising image quality. Furthermore, we demonstrate that our method integrates seamlessly with various types of solvers for faster sampling, capitalizing on their compatibility to enhance overall efficiency. The source code and our experiments are available at https://github.com/taehong-moon/ee-diffusion.
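To make the idea of a time-dependent exit schedule concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Everything named here is hypothetical: the helper functions, the toy blocks, the schedule values, the convention that time lies in [0, 1], and the choice to drop more blocks at larger t. It also assumes that the skipped blocks are the trailing ones in the stack; the actual schedules and skipping rules are in the linked repository.

```python
from typing import Callable, List, Sequence, Tuple

# A "block" maps (features, time) -> features. Real blocks would be
# transformer layers; here they are simple callables on lists of floats.
Block = Callable[[List[float], float], List[float]]

# Hypothetical schedule format: (upper_time_bound, blocks_dropped) pairs,
# sorted by upper_time_bound.
Schedule = Sequence[Tuple[float, int]]


def blocks_to_run(t: float, schedule: Schedule, num_blocks: int) -> int:
    """Return how many leading blocks to evaluate at diffusion time t."""
    for upper, dropped in schedule:
        if t <= upper:
            # Always keep at least one block so the network produces output.
            return max(1, num_blocks - dropped)
    # Times beyond the last bound fall back to the full network.
    return num_blocks


def early_exit_forward(
    x: List[float], t: float, blocks: Sequence[Block], schedule: Schedule
) -> Tuple[List[float], int]:
    """Run only the first `keep` blocks, where `keep` depends on t."""
    keep = blocks_to_run(t, schedule, len(blocks))
    for block in blocks[:keep]:
        x = block(x, t)
    return x, keep


def toy_block(x: List[float], t: float) -> List[float]:
    """Stand-in block: adds 1 to every element so the depth used is visible."""
    return [v + 1.0 for v in x]


# Illustrative values only (assumption): 8 blocks, with more blocks
# dropped as t grows.
TOY_BLOCKS = [toy_block] * 8
TOY_SCHEDULE = [(0.25, 0), (0.5, 1), (0.75, 2), (1.0, 3)]

if __name__ == "__main__":
    for t in (0.1, 0.4, 0.6, 0.9):
        out, keep = early_exit_forward([0.0], t, TOY_BLOCKS, TOY_SCHEDULE)
        print(f"t={t}: ran {keep} of {len(TOY_BLOCKS)} blocks, output={out}")
```

Because the schedule depends only on t, it can be evaluated before each network call, which is what lets a scheme of this kind sit underneath any solver that queries the score network one time step at a time.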
