Timestep-Aware Correction for Quantized Diffusion Models
Yuzhe Yao, Feng Tian, Jun Chen, Haonan Lin, Guang Dai, Yong Liu, Jingdong Wang
TL;DR
Diffusion models deliver high-fidelity images but are computationally intensive, and post-training quantization (PTQ) introduces errors that accumulate across denoising steps and degrade quality. The authors propose TAC-Diffusion, a timestep-aware correction framework that mitigates quantization errors dynamically during denoising via Noise Estimation Reconstruction (NER) and Input Bias Correction (IBC), without additional training. NER reconstructs the noise estimate using per-timestep correction coefficients, which are obtained in closed form from a convex objective built on a masked loss with a relative distortion measure (rQNSR); IBC corrects the input bias at each timestep. Experiments on CIFAR-10, LSUN, and Stable Diffusion show that TAC-Diffusion achieves state-of-the-art performance among low-precision diffusion models, significantly narrowing the gap to full-precision models and enabling efficient deployment on resource-constrained devices.
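To make the idea of a per-timestep, closed-form correction concrete, the following is a toy sketch rather than the authors' algorithm. It assumes a single scalar coefficient per timestep fitted by ordinary least squares against full-precision outputs, and a mean-offset input bias; the paper's masked loss and rQNSR weighting are not reproduced. All function names and the synthetic calibration data are hypothetical.

```python
import random


def fit_scale(fp, q):
    """Closed-form minimizer of sum((fp_i - k * q_i)^2) over a scalar k.

    The objective is a convex quadratic in k, so k = <fp, q> / <q, q>.
    """
    denom = sum(v * v for v in q)
    if denom == 0.0:
        return 1.0  # nothing to rescale; fall back to the identity
    return sum(a * b for a, b in zip(fp, q)) / denom


def fit_bias(fp_inputs, q_inputs):
    """Mean offset between full-precision and quantized-path inputs."""
    return sum(a - b for a, b in zip(fp_inputs, q_inputs)) / len(fp_inputs)


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic calibration data (assumption): the "quantized" noise estimate is
# a shrunken, noisy copy of the full-precision one, and the amount of
# shrinkage depends on the timestep.
rng = random.Random(0)
fp_by_t, q_by_t = {}, {}
for t in range(4):
    fp = [rng.gauss(0.0, 1.0) for _ in range(256)]
    shrink = 1.0 - 0.1 * (t + 1)
    q = [shrink * v + rng.gauss(0.0, 0.05) for v in fp]
    fp_by_t[t], q_by_t[t] = fp, q

# One coefficient per timestep, each obtained in closed form.
coeffs = {t: fit_scale(fp_by_t[t], q_by_t[t]) for t in fp_by_t}

errors_before = {t: mse(fp_by_t[t], q_by_t[t]) for t in fp_by_t}
errors_after = {
    t: mse(fp_by_t[t], [coeffs[t] * v for v in q_by_t[t]]) for t in fp_by_t
}

for t in sorted(coeffs):
    print(t, round(coeffs[t], 3), errors_before[t], errors_after[t])
```

Because leaving the output unscaled (k = 1) is one of the candidates the least-squares fit considers, the corrected error on the calibration data can never exceed the uncorrected error. Fitting a separate coefficient for each timestep lets the correction follow distortion that varies along the denoising trajectory.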
Abstract
Diffusion models have marked a significant breakthrough in the synthesis of semantically coherent images. However, their large noise estimation networks and iterative generation process limit their wider application, particularly on resource-constrained platforms such as mobile devices. Existing post-training quantization (PTQ) methods have managed to compress diffusion models to low precision. Nevertheless, because generation is iterative, quantization errors tend to accumulate throughout the process. This accumulation becomes particularly problematic in low-precision scenarios, leading to significant distortions in the generated images. We attribute the accumulation to two main causes: error propagation and exposure bias. To address these problems, we propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error. Applying the proposed method to low-precision diffusion models substantially enhances output quality with only negligible computational overhead. Extensive experiments underscore our method's effectiveness and generalizability. With the proposed correction strategy, we achieve state-of-the-art (SOTA) results on low-precision models.
