DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Haowei Zhu, Dehua Tang, Ji Liu, Mingjie Lu, Jintu Zheng, Jinzhang Peng, Dong Li, Yu Wang, Fan Jiang, Lu Tian, Spandan Tiwari, Ashish Sirasao, Jun-Hai Yong, Bin Wang, Emad Barsoum

TL;DR

This work proposes a novel pruning method that derives an efficient diffusion model via a more intelligent, differentiable pruner, achieving a 4.4× speedup for SD-1.5 without any loss of accuracy.

Abstract

Diffusion models have achieved remarkable progress in the field of image generation due to their outstanding capabilities. However, these models require substantial computing resources because of the multi-step denoising process during inference. While traditional pruning methods have been employed to optimize these models, the retraining process necessitates large-scale training datasets and extensive computational costs to maintain generalization ability, making it neither convenient nor efficient. Recent studies attempt to utilize the similarity of features across adjacent denoising stages to reduce computational costs through simple and static strategies. However, these strategies cannot fully harness the potential of the similar feature patterns across adjacent timesteps. In this work, we propose a novel pruning method that derives an efficient diffusion model via a more intelligent and differentiable pruner. At the core of our approach is casting the model pruning process into a SubNet search process. Specifically, we first introduce a SuperNet based on standard diffusion by adding backup connections built upon the similar features. We then construct a plugin pruner network and design optimization losses to identify redundant computation. Finally, our method can identify an optimal SubNet through few-step gradient optimization and a simple post-processing procedure. We conduct extensive experiments on various diffusion models, including the Stable Diffusion series and DiTs. Our DiP-GO approach achieves a 4.4× speedup for SD-1.5 without any loss of accuracy, significantly outperforming the previous state-of-the-art methods.
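
The SuperNet/SubNet idea can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a purely sequential stack of blocks, and it assumes a "backup connection" means reusing the block's output cached from the most recent step in which that block ran. The names `run_subnet`, `blocks`, and `gates` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A toy "block": maps (hidden value, timestep index) to a new hidden value.
Block = Callable[[float, int], float]


def run_subnet(
    blocks: Sequence[Block],
    gates: Sequence[Sequence[int]],
    x0: float,
) -> Tuple[float, int]:
    """Run a toy multi-step sampler in which some blocks are skipped.

    gates[t][n] == 1 executes block n at step t. gates[t][n] == 0 skips it
    and reuses the output cached from the last step where it ran (the
    assumed "backup connection"). Returns the final value and the number
    of block executions.
    """
    cache: List[Optional[float]] = [None] * len(blocks)
    executed = 0
    x = x0
    for t, step_gates in enumerate(gates):
        h = x
        for n, block in enumerate(blocks):
            # A block with no cached output yet must run regardless of its gate.
            if step_gates[n] or cache[n] is None:
                cache[n] = block(h, t)
                executed += 1
            h = cache[n]
        x = h
    return x, executed
```

With all gates set to 1 this reduces to the full inference path; zeros in `gates` trade accuracy for fewer block executions, which is the quantity the pruner is meant to minimize.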

Paper Structure

This paper contains 22 sections, 3 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of the SuperNet and SubNet. Standard diffusion models execute the full inference path step by step. In our framework, we propose a SuperNet based on the original flow and integrate backup connections to facilitate block removal. This allows the partial inference SubNet to efficiently eliminate redundant computational costs.
  • Figure 2: Overview of our diffusion pruner. a) DiP-GO employs a pruner network to learn the importance scores of blocks in the diffusion sampling process. It takes $N \times T$ queries as input and passes them through stacked self-attention (SA) and fully connected (FC) layers to capture the structural information in existing diffusion models. The network predicts the partial inference paths based on the $N \times T$ importance scores and is optimized by consistent and sparse loss. b) Once trained, the pruner network is discarded. We can infer the optimal partial inference path with expected computational costs via post-processing based on the predicted importance scores.
  • Figure 3: Visualization of generated images. It shows evolving patterns as pruning ratios increase. Despite these changes, main objects in the images remain consistent with the textual conditions.
  • Figure 4: Visualization of images generated by the DiT model: samples using DDIM with 250 steps (top row) and with 60% of MACs pruned (bottom row). The speedup ratio here is $2.4\times$.
  • Figure 5: Qualitative comparison with existing methods. We compare our method (pruning ratio 0.75) with DeepCache (N=4).
  • ...and 2 more figures
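
The post-processing step described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it assumes the lowest-scoring blocks are pruned greedily until a target fraction of total cost is removed, and that every block in the first step is kept. The function `select_gates`, the `scores[t][n]` layout, and the per-block `costs` are hypothetical.

```python
from typing import List, Sequence


def select_gates(
    scores: Sequence[Sequence[float]],
    costs: Sequence[float],
    prune_ratio: float,
) -> List[List[int]]:
    """Turn importance scores into a binary keep/skip mask.

    scores[t][n] is the predicted importance of block n at step t.
    costs[n] is the compute cost of block n (for example, its MACs).
    Blocks are pruned in ascending score order until the pruned cost
    reaches prune_ratio of the total cost. Step 0 is never pruned
    (an assumption of this sketch).
    """
    num_steps, num_blocks = len(scores), len(costs)
    target = prune_ratio * num_steps * sum(costs)
    candidates = sorted(
        (scores[t][n], t, n)
        for t in range(1, num_steps)
        for n in range(num_blocks)
    )
    gates = [[1] * num_blocks for _ in range(num_steps)]
    pruned = 0.0
    for _, t, n in candidates:
        if pruned >= target:
            break
        gates[t][n] = 0  # skip this block at this step
        pruned += costs[n]
    return gates
```

Because whole blocks are removed one at a time, the pruned cost can slightly overshoot the target; a real implementation would need its own policy for that boundary.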