Structural Pruning for Diffusion Models

Gongfan Fang, Xinyin Ma, Xinchao Wang

TL;DR

This paper tackles the high computational cost of diffusion models by introducing Diff-Pruning, a Taylor-expansion-based structural pruning method that removes unimportant weights from pre-trained diffusion models. It aggregates first-order loss sensitivities over a subset of timesteps, using binary masks to keep informative steps and discard non-contributory ones when estimating weight importance. The method achieves roughly a 50% reduction in FLOPs while preserving generation quality and consistency with the pre-trained model, at only 10–20% of the original training cost. Experiments on CIFAR-10, CelebA-HQ, LSUN, and ImageNet-1K demonstrate these efficiency gains, with ablations clarifying the roles of pruned timesteps, pruning ratios, and thresholding in maintaining fidelity (FID) and consistency (SSIM). The results position Diff-Pruning as a practical baseline for diffusion-model compression and a foundation for future work on scalable, consistent generative modeling.

Abstract

Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs). The impressive capability of these models, however, often entails significant computational overhead during both training and inference. To tackle this challenge, we present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones, without the need for extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights. Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method: 1) Efficiency: it enables approximately a 50% reduction in FLOPs at a mere 10% to 20% of the original training expenditure; 2) Consistency: the pruned diffusion models inherently preserve generative behavior congruent with their pre-trained models. Code is available at https://github.com/VainF/Diff-Pruning.
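The idea of a Taylor expansion over pruned timesteps can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the toy numbers, the relative-loss rule used to set the binary weight alpha_t, and the grouping of weights into fixed-size channels are all assumptions made for illustration.

```python
from typing import List, Sequence


def timestep_mask(losses: Sequence[float], threshold: float) -> List[int]:
    """Binary alpha_t per timestep.

    Assumed rule: keep a step only if its loss, relative to the largest
    per-step loss, exceeds `threshold`.
    """
    l_max = max(losses)
    return [1 if (loss / l_max) > threshold else 0 for loss in losses]


def taylor_importance(
    weights: Sequence[float],
    grads_per_step: Sequence[Sequence[float]],
    alphas: Sequence[int],
) -> List[float]:
    """First-order score |theta_i * sum_t alpha_t * g_{t,i}| for each weight."""
    scores = []
    for i, w in enumerate(weights):
        # Ensemble gradients from the kept timesteps only.
        g = sum(a * grads[i] for a, grads in zip(alphas, grads_per_step))
        scores.append(abs(w * g))
    return scores


def channel_scores(scores: Sequence[float], channel_size: int) -> List[float]:
    """Structural pruning removes whole channels, so sum scores per channel."""
    return [
        sum(scores[i:i + channel_size])
        for i in range(0, len(scores), channel_size)
    ]


def channels_to_prune(ch_scores: Sequence[float], ratio: float) -> List[int]:
    """Indices of the lowest-scoring channels for a given pruning ratio."""
    n_prune = int(len(ch_scores) * ratio)
    order = sorted(range(len(ch_scores)), key=lambda c: ch_scores[c])
    return sorted(order[:n_prune])


# Toy data (hypothetical): 4 timesteps, 6 weights arranged as 3 channels of 2.
losses = [0.9, 0.5, 0.02, 0.01]
weights = [0.5, -0.2, 1.0, 0.1, -0.4, 0.3]
grads_per_step = [
    [0.1, 0.0, -0.2, 0.3, 0.01, 0.0],
    [0.1, 0.1, -0.1, 0.1, 0.0, 0.02],
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # masked out by alpha_t = 0
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # masked out by alpha_t = 0
]

alphas = timestep_mask(losses, threshold=0.05)
scores = taylor_importance(weights, grads_per_step, alphas)
pruned = channels_to_prune(channel_scores(scores, channel_size=2), ratio=0.5)
print(alphas, pruned)
```

In this toy setup the two low-loss timesteps are ignored, so their large gradients do not influence the ranking, and the channel with the smallest aggregated score is selected for removal. In a real model the gradients would come from backpropagating the per-timestep denoising loss.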

Paper Structure (25 sections, 10 equations, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: Diff-Pruning leverages Taylor expansion at pruned timesteps to estimate the importance of weights, where early steps focus on local details like edges and color and later ones pay more attention to content such as objects and shapes. We propose a simple thresholding method to trade off these factors with a binary weight $\alpha_t \in \{0, 1\}$, leading to a practical algorithm for diffusion models. The generated images produced by 5%-pruned DDPMs (without post-training) are illustrated.
  • Figure 2: Generated images of the pre-trained models (Ho et al., 2020) (left) and the pruned models (right) on LSUN Church and LSUN Bedroom. SSIM measures the similarity between generated images.
  • Figure 3: Images sampled from the pruned conditional LDM on ImageNet-1K-256.
  • Figure 4: Generated images of 5%-pruned models using different importance criteria. We report the SSIM of batched images without post-training.
  • Figure 5: The SSIM of models pruned with different numbers of timesteps. For CIFAR-10, most of the late timesteps can be pruned safely. For CelebA-HQ, using more steps is consistently beneficial.
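The captions above use SSIM to compare images from the pruned model with those from the pre-trained one. As a rough illustration of what that score captures, the sketch below computes a single-window ("global") SSIM over two flattened images in plain Python. This is a simplification: standard SSIM averages the same statistic over local sliding windows, and the function name, the toy pixel values, and the [0, 1] data range are assumptions made here, not details from the paper.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 1.0) -> float:
    """Single-window SSIM between two equally sized, flattened images.

    Uses the usual stabilizing constants C1 = (0.01 * L)^2 and
    C2 = (0.03 * L)^2, where L is the pixel value range.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 "images", flattened, with pixel values in [0, 1].
reference = [0.1, 0.4, 0.8, 0.3]  # e.g. output of the pre-trained model
close = [0.1, 0.42, 0.78, 0.3]    # e.g. a pruned model that stays consistent
far = [0.9, 0.1, 0.2, 0.7]        # e.g. a pruned model that drifted

print(global_ssim(reference, close), global_ssim(reference, far))
```

A score near 1 means the pruned model reproduces nearly the same image as the pre-trained model for the same input noise, which is the sense of consistency the captions refer to.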