D$^2$-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models

Qian Zeng, Jie Song, Han Zheng, Hao Jiang, Mingli Song

TL;DR

D$^2$-DPM addresses the challenge of deploying diffusion models under post-training quantization (PTQ) by introducing a dual denoising framework that separately corrects the mean drift and the variance inflation induced by quantization noise. The method builds a time-step-aware Gaussian model of the quantization noise and the quantized network outputs, and offers two practical variants, stochastic dual denoising (S-D$^2$) and deterministic dual denoising (D-D$^2$), that remove the quantization noise at each inverse diffusion step. Empirical results on ImageNet and LSUN show substantial fidelity gains and compression/acceleration improvements over prior PTQ approaches, with ablations confirming the effectiveness of both the mean and the variance corrections. The approach enables high-fidelity, low-cost deployment of diffusion models and suggests applicability to other domains such as video, text, and molecular design.
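
To make the idea of a Gaussian model over the quantization noise and the quantized outputs concrete, the following is a minimal sketch and not the paper's exact procedure. It assumes a single scalar element at one time step, fits a bivariate Gaussian to paired samples of the quantized output and the quantization noise, and applies the standard Gaussian conditioning formulas to estimate the noise from an observed output. The function names (`fit_joint_gaussian`, `conditional_noise`) and the synthetic data are hypothetical.

```python
import numpy as np

def fit_joint_gaussian(eps_hat, delta):
    """Fit a bivariate Gaussian to (quantized output, quantization noise) samples."""
    mu = np.array([eps_hat.mean(), delta.mean()])
    cov = np.cov(np.stack([eps_hat, delta]))  # 2x2 covariance, rows are variables
    return mu, cov

def conditional_noise(eps_hat_value, mu, cov):
    """Mean and variance of the noise given an observed quantized output.

    Standard Gaussian conditioning; assumes the pair is jointly Gaussian.
    """
    gain = cov[0, 1] / cov[0, 0]
    cond_mean = mu[1] + gain * (eps_hat_value - mu[0])
    cond_var = cov[1, 1] - gain * cov[0, 1]
    return cond_mean, cond_var

# Synthetic calibration data for ONE time step (hypothetical numbers).
# A time-step-aware model would repeat this fit for every step.
rng = np.random.default_rng(0)
eps = rng.standard_normal(100_000)                 # stand-in for full-precision output
delta = 0.02 + 0.1 * eps + 0.05 * rng.standard_normal(eps.size)  # correlated noise
eps_hat = eps + delta                              # quantized output

mu, cov = fit_joint_gaussian(eps_hat, delta)
cond_mean, cond_var = conditional_noise(1.0, mu, cov)
print(cond_mean, cond_var, cov[1, 1])
```

In this sketch, conditioning on the observed output leaves a residual variance smaller than the marginal variance of the noise, which is why a joint model can be more informative than treating the noise as independent of the output.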

Abstract

Diffusion models have achieved cutting-edge performance in image generation. However, their lengthy denoising process and computationally intensive score estimation network impede their scalability in low-latency and resource-constrained scenarios. Post-training quantization (PTQ) compresses and accelerates diffusion models without retraining, but it inevitably introduces additional quantization noise, resulting in mean and variance deviations. In this work, we propose D$^2$-DPM, a dual denoising mechanism aimed at precisely mitigating the adverse effects of quantization noise on the noise estimation network. Specifically, we first unravel the impact of quantization noise on the sampling equation into two components: the mean deviation and the variance deviation. The mean deviation alters the drift coefficient of the sampling equation, influencing the trajectory trend, while the variance deviation magnifies the diffusion coefficient, impacting the convergence of the sampling trajectory. The proposed D$^2$-DPM is thus devised to denoise the quantization noise at each time step, and then denoise the noisy sample through the inverse diffusion iterations. Experimental results demonstrate that D$^2$-DPM achieves superior generation quality, yielding a 1.42 lower FID than the full-precision model while achieving 3.99x compression and 11.67x bit-operation acceleration.
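
The split into a mean deviation and a variance deviation can be illustrated numerically. The sketch below is a simplified illustration under stated assumptions rather than the paper's derivation: it uses the standard DDPM reverse step, treats the quantization noise as Gaussian and independent of the noise estimate, and uses hypothetical schedule values and noise statistics. Under those assumptions, the mean of the quantization noise shifts the next sample deterministically (a drift change), and its variance adds extra spread (a larger effective diffusion term).

```python
import numpy as np

def ddpm_step(x_t, eps, alpha_t, alpha_bar_t, sigma_t, z):
    """One standard DDPM reverse step given a noise estimate eps."""
    beta_t = 1.0 - alpha_t
    coef = beta_t / np.sqrt(1.0 - alpha_bar_t)
    return (x_t - coef * eps) / np.sqrt(alpha_t) + sigma_t * z

def quantization_effect(alpha_t, alpha_bar_t, mu_q, sigma_q):
    """Predicted mean shift and added variance of x_{t-1} when the noise
    estimate is perturbed by independent N(mu_q, sigma_q^2) noise."""
    beta_t = 1.0 - alpha_t
    k = beta_t / (np.sqrt(alpha_t) * np.sqrt(1.0 - alpha_bar_t))
    return -k * mu_q, (k * sigma_q) ** 2

rng = np.random.default_rng(0)
n = 200_000
alpha_t, alpha_bar_t, sigma_t = 0.98, 0.5, 0.1   # hypothetical schedule values
mu_q, sigma_q = 0.05, 0.2                        # hypothetical noise statistics

x_t = rng.standard_normal(n)
eps = rng.standard_normal(n)                     # stand-in for the full-precision estimate
z = rng.standard_normal(n)
delta = mu_q + sigma_q * rng.standard_normal(n)  # simulated quantization noise

x_fp = ddpm_step(x_t, eps, alpha_t, alpha_bar_t, sigma_t, z)
x_q = ddpm_step(x_t, eps + delta, alpha_t, alpha_bar_t, sigma_t, z)

diff = x_q - x_fp
pred_shift, pred_var = quantization_effect(alpha_t, alpha_bar_t, mu_q, sigma_q)
print("mean shift:", diff.mean(), "predicted:", pred_shift)
print("added variance:", diff.var(), "predicted:", pred_var)
```

Correcting the sampler then amounts to subtracting the predicted shift and accounting for the added variance at each step. How D$^2$-DPM does this precisely, including any correlation between the noise and the output, is specified in the paper itself.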

Paper Structure

This paper contains 18 sections, 18 equations, 2 figures, 5 tables, and 1 algorithm.

Figures (2)

  • Figure 1: Comparison of generated samples on the ImageNet 256$\times$256 between full-precision LDM-4 and its quantized versions using PTQ4DM, PTQD, and our proposed D$^2$-DPM (comprising two variants, S-D$^2$ and D-D$^2$).
  • Figure 2: Statistical characteristics of $\Delta\bm{\epsilon}_{\theta}$ and $\bm{\hat{\epsilon}}_{\theta}$ when quantizing the full-precision LDM-4 [rombach2022high] to a W4A8 LDM-4 (4-bit weights, 8-bit activations). (a) The statistical distribution of the $3^{rd}$ element of $\Delta\bm{\epsilon}_{\theta}^{(0.5T)}$. (b) The statistical distribution of the $5^{th}$ element of $\bm{\hat{\epsilon}}_{\theta}^{(0.5T)}$. (c) The probability density heatmap over the element pairs $\left(\bm{\hat{\epsilon}}_{\theta}^{(0.5T)}, \Delta\bm{\epsilon}_{\theta}^{(0.5T)}\right)$.