Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility
Yiheng Li, Feng Liang, Dan Kondratyuk, Masayoshi Tomizuka, Kurt Keutzer, Chenfeng Xu
TL;DR
This paper identifies trajectory miscibility as a core bottleneck in diffusion model training and proposes immiscible diffusion as a broad, implementation-agnostic approach to reducing the mixing of diffusion trajectories across images. It provides analysis and empirical evidence that denoising remains stable and diverse under immiscible diffusion, and introduces practical implementations, such as K-nearest neighbor (KNN) noise selection and image scaling, that realize large speedups. Across unconditional and conditional image generation, image editing, and robotics planning tasks, immiscible diffusion trains more than 4x faster in the best cases while preserving quality and prompt fidelity. The work also connects optimal transport (OT) concepts to diffusion training, broadening the perspective on how to design high-efficiency diffusion systems and suggesting new directions for future research.
Abstract
The substantial training cost of diffusion models hinders their deployment. Immiscible Diffusion recently showed that reducing diffusion trajectory mixing in the noise space via linear assignment accelerates training by simplifying denoising. Because linear assignment becomes inefficient at large batch sizes and in high dimensions, we generalize this concept to miscibility reduction at any layer and by any implementation. Specifically, we empirically demonstrate the bijective nature of the denoising process with respect to immiscible diffusion, ensuring its preservation of generative diversity. Moreover, we provide thorough analysis and show step by step how immiscibility eases denoising and improves efficiency. Extending beyond linear assignment, we propose a family of implementations, including K-nearest neighbor (KNN) noise selection and image scaling, that reduce miscibility and train more than 4x faster in the best cases across diverse models and tasks, including unconditional/conditional generation, image editing, and robotics planning. Furthermore, our analysis of immiscibility offers a novel perspective on how optimal transport (OT) enhances diffusion training. By identifying trajectory miscibility as a fundamental bottleneck, we believe this work establishes a potentially new direction for future research into high-efficiency diffusion training. The code is available at https://github.com/yhli123/Immiscible-Diffusion.
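To make the idea of KNN noise selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "KNN noise selection" means drawing k Gaussian noise candidates per image and keeping the one closest to that image in Euclidean distance; the abstract does not spell out the procedure. The function names, the toy data, and the choice of k are all hypothetical, and the code uses only the Python standard library for readability, whereas a real training loop would use batched tensor operations.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_select_noise(image, k, rng):
    """Draw k Gaussian noise candidates and return the one closest to `image`.

    Assumption: a per-image nearest-candidate choice; the exact procedure
    is not given in the abstract.
    """
    candidates = [[rng.gauss(0.0, 1.0) for _ in image] for _ in range(k)]
    return min(candidates, key=lambda noise: sq_dist(image, noise))


def mean_pair_distance(images, k, rng):
    """Average squared image-to-noise distance over a batch for a given k."""
    total = sum(sq_dist(img, knn_select_noise(img, k, rng)) for img in images)
    return total / len(images)


data_rng = random.Random(0)
# Toy "images": 64 flattened vectors of dimension 16 with entries in [-1, 1].
images = [[data_rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(64)]

# k=1 is ordinary random noise pairing; k=8 picks the nearest of 8 candidates.
baseline = mean_pair_distance(images, k=1, rng=random.Random(1))
selected = mean_pair_distance(images, k=8, rng=random.Random(1))
print(f"mean squared distance, k=1: {baseline:.2f}")
print(f"mean squared distance, k=8: {selected:.2f}")
```

Under these assumptions, a larger k pairs each image with closer noise, which is one way to read "reducing miscibility". Each selection is made independently per image, so the cost grows linearly with batch size instead of requiring a batch-wide linear assignment.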
