DVD-Quant: Data-free Video Diffusion Transformers Quantization

Zhiteng Li; Hanxuan Li; Junyi Wu; Kai Liu; Haotong Qin; Linghe Kong; Guihai Chen; Yulun Zhang; Xiaokang Yang

TL;DR

Diffusion Transformers (DiTs) enable high-fidelity video generation but are hindered by heavy compute and memory demands. DVD-Quant is a data-free post-training quantization (PTQ) framework for Video DiTs. It combines Bounded-init Grid Refinement and Auto-scaling Rotated Quantization, which reduce quantization error without calibration data, with δ-Guided Bit Switching, which adapts the bit-width across timesteps. The approach achieves an approximately $2\times$ speedup over full-precision baselines while maintaining visual fidelity, and it is the first to enable W4A4 (4-bit weights, 4-bit activations) PTQ for Video DiTs without compromising video quality. It is also compatible with cache-based acceleration such as TeaCache. These results make high-quality video diffusion models more practical to deploy on resource-constrained hardware.

Abstract

Diffusion Transformers (DiTs) have emerged as the state-of-the-art architecture for video generation, yet their computational and memory demands hinder practical deployment. While post-training quantization (PTQ) is a promising approach to accelerating Video DiT models, existing methods suffer from two critical limitations: (1) dependence on computation-heavy and inflexible calibration procedures, and (2) considerable performance deterioration after quantization. To address these challenges, we propose DVD-Quant, a novel data-free quantization framework for Video DiTs. Our approach integrates three key innovations: (1) Bounded-init Grid Refinement (BGR) and (2) Auto-scaling Rotated Quantization (ARQ), which reduce quantization error without calibration data, and (3) $δ$-Guided Bit Switching ($δ$-GBS), which allocates bit-width adaptively. Extensive experiments across multiple video generation benchmarks demonstrate that DVD-Quant achieves an approximately 2$\times$ speedup over full-precision baselines on advanced DiT models while maintaining visual fidelity. Notably, DVD-Quant is the first to enable W4A4 PTQ for Video DiTs without compromising video quality. Code and models will be available at https://github.com/lhxcs/DVD-Quant.
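To make the terms "4-bit quantization" and "rotated quantization" concrete, the following is a minimal NumPy sketch of the general idea, not the paper's BGR or ARQ procedure. It assumes symmetric per-tensor quantization and a normalized Hadamard rotation; the function names `quantize_symmetric` and `hadamard`, the vector size, and the injected outlier are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize_symmetric(x, bits=4):
    """Fake-quantize x onto a signed `bits`-bit grid with one scale per tensor."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    scale = np.abs(x).max() / qmax
    if scale == 0:
        return x.copy()
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                        # dequantized values

def hadamard(n):
    """Normalized n x n Hadamard matrix (n must be a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

rng = np.random.default_rng(0)
x = rng.normal(size=64)
x[3] = 50.0                                 # a single large outlier (illustrative)
H = hadamard(64)

# Direct 4-bit quantization: the outlier stretches the grid, so small
# values collapse to zero.
err_plain = np.mean((x - quantize_symmetric(x)) ** 2)

# Rotate, quantize, rotate back: the orthogonal rotation spreads the
# outlier's energy over all coordinates, so a finer grid fits the data.
x_back = quantize_symmetric(x @ H) @ H.T
err_rot = np.mean((x - x_back) ** 2)

print(f"MSE without rotation: {err_plain:.4f}")
print(f"MSE with rotation:    {err_rot:.4f}")
```

In this toy setting the rotated path has the lower reconstruction error because the rotation is orthogonal (it preserves the error norm) while shrinking the dynamic range the 4-bit grid has to cover. How DVD-Quant chooses its rotation and scaling is described in the paper itself and is not reproduced here.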
