High Noise Scheduling is a Must
Mahmut S. Gokmen, Cody Bumgardner, Jie Zhang, Ge Wang, Jin Chen
TL;DR
This paper addresses denoising efficiency in consistency models by balancing the distribution of high and low noise levels during training. It introduces a polynomial noise distribution supported by a predefined Karras-style noise vector, together with a sinusoidal curriculum that stabilizes noise progression across discretization steps. On CIFAR-10, the polynomial distribution reaches an FID of 33.54 and outperforms the log-normal baseline; pairing it with the sinusoidal curriculum improves the FID to 30.48. The approach improves denoising performance and training stability, which suggests it may apply to other fast, single-step diffusion-like sampling methods.
Abstract
Consistency models are highly capable image generators that reduce sampling to a single step. Recent advances improve consistency training and remove the need for distillation-based training. Although the curriculum and noise scheduling proposed in these improved training techniques yield better results than basic consistency models, they lack a well-balanced noise distribution and consistency between the noise distribution and the curriculum. In this study, we investigate the balance between high and low noise levels in the noise distribution and propose a polynomial noise distribution to maintain stability. The proposed polynomial noise distribution is supported by a predefined set of Karras noise levels, which prevents the unique noise levels that arise with the Karras noise generation algorithm. Furthermore, eliminating already-learned noisy steps with a curriculum based on a sinusoidal function improves the model's denoising performance. To make a fair comparison with the most recently released consistency model training techniques, experiments are conducted with the same hyperparameters, except for the curriculum and the noise distribution. The models used in the experiments are deliberately shallow in order to demonstrate the robustness of the proposed technique. The results show that the model trained with the polynomial noise distribution outperforms the model trained with the log-normal noise distribution, reaching an FID score of 33.54 after 100,000 training steps with constant discretization steps. Additionally, the sinusoidal curriculum further enhances denoising performance, resulting in an FID score of 30.48.
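To make the three ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. Only `karras_sigmas` follows a published formula (the Karras et al. discretization, with its commonly used defaults). The functions `polynomial_weights` and `sinusoidal_steps`, their parameters `degree`, `s0`, and `s1`, and the specific functional forms are assumptions chosen for illustration, since the abstract does not give the exact polynomial or sinusoidal expressions.

```python
import math
import random


def karras_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Karras et al. discretization: n noise levels, ascending from
    sigma_min to sigma_max. Defaults are the commonly used values."""
    if n < 2:
        raise ValueError("need at least two noise levels")
    lo = sigma_min ** (1.0 / rho)
    hi = sigma_max ** (1.0 / rho)
    return [(lo + i / (n - 1) * (hi - lo)) ** rho for i in range(n)]


def polynomial_weights(n, degree=2.0):
    """ASSUMPTION: one possible polynomial sampling distribution over
    level indices, p_i proportional to ((i + 1) / n) ** degree.
    With degree > 0 it places more probability on high noise levels."""
    raw = [((i + 1) / n) ** degree for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def sinusoidal_steps(k, total_steps, s0=10, s1=1280):
    """ASSUMPTION: one possible sinusoidal curriculum. The number of
    discretization steps rises from s0 to s1 along a quarter sine wave
    as training step k goes from 0 to total_steps."""
    frac = min(max(k / total_steps, 0.0), 1.0)
    return s0 + round((s1 - s0) * math.sin(0.5 * math.pi * frac))


def sample_noise_level(sigmas, weights, rng):
    """Draw one noise level according to the given index weights."""
    idx = rng.choices(range(len(sigmas)), weights=weights, k=1)[0]
    return sigmas[idx]


if __name__ == "__main__":
    rng = random.Random(0)
    n = sinusoidal_steps(k=50_000, total_steps=100_000)
    sigmas = karras_sigmas(n)            # predefined noise vector
    weights = polynomial_weights(n)      # favors high noise
    print(n, sample_noise_level(sigmas, weights, rng))
```

In this sketch the noise vector is computed once per discretization size and then reused, which is one way to read "predefined Karras noises"; the paper itself should be consulted for the actual schedule and curriculum definitions.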
