PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution

Yong Liu; Hang Dong; Jinshan Pan; Qingji Dong; Kai Chen; Rongxiang Zhang; Lean Fu; Fei Wang

TL;DR

PatchScaler introduces a Patch-adaptive Group Sampling (PGS) strategy that groups feature patches by quantifying their reconstruction difficulty and establishes shortcut paths with different sampling configurations for each group. To further optimize the patch-level reconstruction process of PGS, a texture prompt retrieves texture priors for each target patch from a common reference texture memory.

Abstract

While diffusion models significantly improve the perceptual quality of super-resolved images, they usually require a large number of sampling steps, resulting in high computational costs and long inference times. Recent efforts have explored reasonable acceleration schemes by reducing the number of sampling steps. However, these approaches treat all regions of the image equally, overlooking the fact that regions with varying levels of reconstruction difficulty require different sampling steps. To address this limitation, we propose PatchScaler, an efficient patch-independent diffusion pipeline for single image super-resolution. Specifically, PatchScaler introduces a Patch-adaptive Group Sampling (PGS) strategy that groups feature patches by quantifying their reconstruction difficulty and establishes shortcut paths with different sampling configurations for each group. To further optimize the patch-level reconstruction process of PGS, we propose a texture prompt that provides rich texture conditional information to the diffusion model. The texture prompt adaptively retrieves texture priors for the target patch from a common reference texture memory. Extensive experiments show that our PatchScaler achieves superior performance in both quantitative and qualitative evaluations, while significantly speeding up inference. Our code will be available at https://github.com/yongliuy/PatchScaler.
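
The grouping idea behind PGS can be illustrated with a minimal sketch. This is not the released implementation: the function names, the three group labels, the confidence thresholds, and the step counts below are all hypothetical, and the sketch assumes a confidence map with values in [0, 1] where higher values mark regions that are easier to reconstruct. It only shows how per-patch confidence could route patches to groups with different sampling budgets.

```python
from typing import Dict, List, Sequence, Tuple

# (group name, minimum mean confidence, sampling steps).
# All names and numbers are illustrative assumptions, not the paper's settings.
# Groups must be listed from highest to lowest threshold.
GROUPS: List[Tuple[str, float, int]] = [
    ("simple", 0.8, 1),
    ("medium", 0.5, 5),
    ("hard", 0.0, 15),
]


def mean_confidence(qmap: Sequence[Sequence[float]], top: int, left: int, size: int) -> float:
    """Average the confidence map over one square patch."""
    values = [v for row in qmap[top:top + size] for v in row[left:left + size]]
    return sum(values) / len(values)


def group_patches(
    qmap: Sequence[Sequence[float]],
    patch_size: int,
    groups: List[Tuple[str, float, int]] = GROUPS,
) -> Dict[str, List[Tuple[int, int]]]:
    """Assign each non-overlapping patch (by its top-left corner) to a group."""
    assigned: Dict[str, List[Tuple[int, int]]] = {name: [] for name, _, _ in groups}
    for top in range(0, len(qmap), patch_size):
        for left in range(0, len(qmap[0]), patch_size):
            conf = mean_confidence(qmap, top, left, patch_size)
            # The first group whose threshold is met receives the patch.
            for name, min_conf, _ in groups:
                if conf >= min_conf:
                    assigned[name].append((top, left))
                    break
    return assigned


def total_steps(
    assigned: Dict[str, List[Tuple[int, int]]],
    groups: List[Tuple[str, float, int]] = GROUPS,
) -> int:
    """Count patch-level sampling steps under the per-group budgets."""
    steps = {name: n for name, _, n in groups}
    return sum(steps[name] * len(patches) for name, patches in assigned.items())
```

Under these assumed budgets, easy patches consume far fewer sampling steps than hard ones, so the total work depends on how much of the image is difficult rather than on its size alone.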

Paper Structure (13 sections, 7 equations, 7 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Proposed Method
Overall Architecture
Patch-adaptive Group Sampling
Texture Prompt
Experiments
Experimental Settings
Comparisons with State-of-the-Art Methods
Evaluations of Patch-adaptive Group Sampling
Text Prompt vs. Texture Prompt
Discussion
Conclusion

Figures (7)

Figure 1: (a) Qualitative analysis of unified sampling and adaptive sampling. (b) Quantitative comparison of diffusion-based SR methods on the RealSR dataset [cai2019toward]. Note that the runtime is measured on the $\times 4$ ($512 \to 2048$) SR task using an NVIDIA Tesla A100 GPU.
Figure 2: Overview of the proposed PatchScaler. PGS dynamically assigns feature patches to groups with different sampling configurations based on a quantified confidence map. Moreover, the texture prompt provides high-quality conditional information for Patch-DiT by retrieving high-quality texture priors from a universal reference texture memory (RTM).
Figure 3: Examples of coarse HR images and corresponding $Qmap$. Our approach can accurately quantify the reconstruction difficulty of different regions across diverse scenes.
Figure 4: Illustration of the proposed PGS, which establishes a shortcut path between $\boldsymbol{x}_{0}$ and $\boldsymbol{y}_{0}$. PGS treats different patches discriminatively and dynamically assigns them different sampling configurations. Here, $\boldsymbol{y}_{0}$ and $\boldsymbol{x}_{0}$ denote the coarse HR patch and ground truth, respectively. $f_{\theta }$ denotes the diffusion model Patch-DiT.
Figure 5: Visual comparisons of state-of-the-art SR methods on real-world low-resolution images.
...and 2 more figures
