PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution

Zhongbao Yang, Jiangxin Dong, Yazhou Yao, Jinhui Tang, Jinshan Pan

TL;DR

The paper tackles the high computational and memory demands of diffusion-based image super-resolution by introducing PGP-DiffSR, which progressively prunes the diffusion backbone across encoder, bottleneck, and decoder while using a phase-exchange adapter to preserve structural details guided by the input phase. The approach combines a coarse-to-fine pruning strategy (PPA) with a phase-informed feature refinement (PEAM), delivering substantial FLOPs and parameter reductions without sacrificing restoration quality. A one-step diffusion variant (PGP-DiffSR-S1) further enhances efficiency, and extensive experiments on RealSR, DrealSR, and RealPhoto60 demonstrate competitive performance with significantly improved efficiency. The method provides practical gains for resource-constrained deployment and includes open-source code.

Abstract

Although diffusion-based models have achieved impressive results in image super-resolution, they often rely on large-scale backbones such as Stable Diffusion XL (SDXL) and Diffusion Transformers (DiT), which lead to excessive computational and memory costs during training and inference. To address this issue, we develop a lightweight diffusion method, PGP-DiffSR, by removing redundant information from diffusion models under the guidance of the phase information of inputs for efficient image super-resolution. We first identify the intra-block redundancy within the diffusion backbone and propose a progressive pruning approach that removes redundant blocks while preserving restoration capability. We note that the phase information of the restored images produced by the pruned diffusion model is not well estimated. To solve this problem, we propose a phase-exchange adapter module that exploits the phase information of the inputs to guide the pruned diffusion model toward better restoration performance. We integrate the progressive pruning approach and the phase-exchange adapter module into a unified model. Extensive experiments demonstrate that our method achieves competitive restoration quality while significantly reducing computational load and memory consumption. The code is available at https://github.com/yzb1997/PGP-DiffSR.
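
To make the role of Fourier phase concrete, the following is a minimal NumPy sketch of two generic operations: a phase-only reconstruction, as defined in the Figure 1 caption, and a plain phase swap between two images. It is an illustration only, not the paper's phase-exchange adapter module, whose internals are not described here. The function names `phase_only_reconstruction` and `exchange_phase` are hypothetical, and the sketch assumes single-channel 2-D arrays of equal size.

```python
import numpy as np


def phase_only_reconstruction(image: np.ndarray) -> np.ndarray:
    """Rebuild an image from its Fourier phase alone (amplitude set to 1)."""
    spectrum = np.fft.fft2(image)
    phase = np.angle(spectrum)
    # Keep only the phase: every frequency gets unit amplitude.
    unit_spectrum = np.exp(1j * phase)
    # For a real input the result is real up to round-off, so drop the
    # negligible imaginary part.
    return np.real(np.fft.ifft2(unit_spectrum))


def exchange_phase(amplitude_source: np.ndarray,
                   phase_source: np.ndarray) -> np.ndarray:
    """Combine the Fourier amplitude of one image with the phase of another.

    Illustrative assumption: both inputs are real 2-D arrays of equal shape.
    """
    amplitude = np.abs(np.fft.fft2(amplitude_source))
    phase = np.angle(np.fft.fft2(phase_source))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
```

Calling `exchange_phase(x, x)` returns `x` up to floating-point error, which is a quick sanity check. Swapping in the phase of a different image moves the visible structure toward that image, which is the intuition behind using the input phase as guidance.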

Paper Structure

This paper contains 14 sections, 5 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Effect of the pruning strategy on image super-resolution. (a) is the LQ patch cropped from the red box in the input image. (b) is the image restored by FaithDiff [faithdiff]. (c) is the image restored after applying a simple pruning strategy [adcsr] to FaithDiff. (d) is the image restored by the proposed PGP-DiffSR. (e–h) are the phase-only reconstructions (POR) of (a–d), where a POR is the image obtained by an inverse Fourier transform after retaining only the Fourier phase. As (e) and (g) show, the structures in the image restored by the pruned model are significantly distorted relative to those of the LQ image. Therefore, we explore useful structures of LQ images in the frequency domain to help the pruned diffusion model achieve better image super-resolution.
  • Figure 2: Network architectures. (a) The proposed PGP-DiffSR mainly consists of a progressive pruning approach, shown in (b), which removes redundant blocks, and a phase-exchange adapter module, which exploits useful structures of LQ images for better image restoration. (c) shows the one-step version of the proposed PGP-DiffSR.
  • Figure 3: Redundancy analysis in diffusion models. (a) Visualizations of the features in the denoising UNet of the diffusion model at each timestep ($t=0 \to t=19$). (b) Box-plots of the KL divergence of the features across adjacent timesteps in the denoising UNet. (I)–(III) indicate $\{\Psi_{\mathrm{KL}}(\mathbf{E}^{(t+1)}_{s},\mathbf{E}^{(t)}_{s})\}_{s=1,2,4}$; (IV) indicates $\Psi_{\mathrm{KL}}(\mathbf{B}^{(t+1)}_{4},\mathbf{B}^{(t)}_{4})$; (V) and (VI) indicate $\Psi_{\mathrm{KL}}(\mathbf{D}^{(t+1)}_{4},\mathbf{D}^{(t)}_{4})$ and $\Psi_{\mathrm{KL}}(\mathbf{D}^{(t+1)}_{2},\mathbf{D}^{(t)}_{2})$, respectively. (c) Box-plots of the cosine dissimilarity of the features across adjacent timesteps in the denoising UNet: (I)–(III) indicate $\{\Phi_{\mathrm{cos}}(\mathbf{E}^{(t+1)}_{s},\mathbf{E}^{(t)}_{s})\}_{s=1,2,4}$; (IV) indicates $\Phi_{\mathrm{cos}}(\mathbf{B}^{(t+1)}_{4},\mathbf{B}^{(t)}_{4})$; (V) and (VI) indicate $\Phi_{\mathrm{cos}}(\mathbf{D}^{(t+1)}_{4},\mathbf{D}^{(t)}_{4})$ and $\Phi_{\mathrm{cos}}(\mathbf{D}^{(t+1)}_{2},\mathbf{D}^{(t)}_{2})$, respectively. The yellow boxes in (a) show that feature variations in the decoder of the denoising UNet remain minimal across certain consecutive timesteps, indicating the presence of redundant blocks within the decoder module.
  • Figure 4: Image SR results ($\times 2$) on the DrealSR dataset [DrealSR] for PGP-DiffSR. The results in (b) to (g) fail to fully restore the rich textures of the flower petals. In contrast, our method generates images with clearer and finer textures.
  • Figure 5: Effectiveness of the proposed PPA for PGP-DiffSR on image super-resolution.
  • ...and 1 more figure
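
The redundancy analysis in Figure 3 compares the features of one block at adjacent timesteps using a KL divergence $\Psi_{\mathrm{KL}}$ and a cosine dissimilarity $\Phi_{\mathrm{cos}}$. The caption does not give the exact definitions, so the sketch below rests on two labeled assumptions: each feature map is turned into a probability distribution by a softmax over its flattened values before the KL divergence is taken, and the cosine dissimilarity is one minus the cosine similarity. All function names are hypothetical.

```python
import numpy as np


def _softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def kl_divergence(feat_next: np.ndarray, feat_prev: np.ndarray,
                  eps: float = 1e-12) -> float:
    """KL(p || q) between softmax-normalised features (assumed normalisation)."""
    p = _softmax(feat_next.ravel())
    q = _softmax(feat_prev.ravel())
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def cosine_dissimilarity(feat_next: np.ndarray, feat_prev: np.ndarray,
                         eps: float = 1e-12) -> float:
    """One minus the cosine similarity of the flattened features (assumed form)."""
    a = feat_next.ravel()
    b = feat_prev.ravel()
    cos = float(a @ b) / (float(np.linalg.norm(a) * np.linalg.norm(b)) + eps)
    return 1.0 - cos


def adjacent_step_scores(features: list) -> list:
    """Score each pair of consecutive timesteps for a single block.

    `features[t]` is the block's output at timestep t; the result holds one
    (KL divergence, cosine dissimilarity) tuple per adjacent pair.
    """
    return [
        (kl_divergence(features[t + 1], features[t]),
         cosine_dissimilarity(features[t + 1], features[t]))
        for t in range(len(features) - 1)
    ]
```

Under these assumptions, a block whose scores stay near zero over many consecutive timesteps changes little from step to step, which is the kind of evidence the caption cites for redundancy in the decoder.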