Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution

Qingping Zheng, Ling Zheng, Yuanfan Guo, Ying Li, Songcen Xu, Jiankang Deng, Hang Xu

TL;DR

The paper tackles artifact generation in diffusion-based super-resolution by introducing SARGD, a training-free framework that combines Reality-Guided Refinement (RGR) with Self-Adaptive Guidance (SAG). RGR uses an artifact detector to identify anomalous latent regions and refines them using a realistic latent, while SAG dynamically updates the guidance latent to counter over-smoothing. Together, the two mechanisms yield artifact-free, high-fidelity SR, reduce the number of sampling steps by a factor of 2, and outperform prior diffusion-based SR methods on both quantitative and perceptual measures. Because it needs no retraining and fewer steps, the approach is a practical route to artifact-free SR in real-world deployment.

Abstract

Artifact-free super-resolution (SR) aims to translate low-resolution images into their high-resolution counterparts while strictly preserving the integrity of the original content, eliminating any distortions or synthetic details. While traditional diffusion-based SR techniques have demonstrated remarkable abilities to enhance image detail, they are prone to introducing artifacts during iterative procedures. Such artifacts, ranging from trivial noise to inauthentic textures, deviate from the true structure of the source image, thus challenging the integrity of the super-resolution process. In this work, we propose Self-Adaptive Reality-Guided Diffusion (SARGD), a training-free method that delves into the latent space to effectively identify and mitigate the propagation of artifacts. SARGD begins by using an artifact detector to identify implausible pixels, creating a binary mask that highlights artifacts. Following this, the Reality-Guided Refinement (RGR) process refines artifacts by integrating this mask with realistic latent representations, improving alignment with the original image. Nonetheless, initial realistic-latent representations from lower-quality images result in over-smoothing in the final output. To address this, we introduce a Self-Adaptive Guidance (SAG) mechanism. It dynamically computes a reality score, enhancing the sharpness of the realistic latent. These alternating mechanisms collectively achieve artifact-free super-resolution. Extensive experiments demonstrate the superiority of our method, delivering detailed artifact-free high-resolution images while reducing sampling steps by $2\times$. We release our code at https://github.com/ProAirVerse/Self-Adaptive-Guidance-Diffusion.git.
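
The mask-based refinement described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `downsample_mask` and `refine_latent`, the latent downsampling factor of 8, the "any flagged pixel flags the latent cell" rule, and the hard replacement of masked cells are all assumptions made for illustration. The artifact detector and the decoder are outside the sketch; it only shows how a binary pixel-space mask could select which latent cells are taken from a realistic latent.

```python
import numpy as np


def downsample_mask(pixel_mask: np.ndarray, factor: int = 8) -> np.ndarray:
    """Reduce a binary pixel-space mask (H, W) to latent resolution.

    Assumption: a latent cell is flagged when any pixel inside its
    factor x factor block is flagged by the artifact detector.
    """
    h, w = pixel_mask.shape
    if h % factor or w % factor:
        raise ValueError("mask size must be divisible by the factor")
    # Split each spatial axis into (block index, offset inside block).
    blocks = pixel_mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.max(axis=(1, 3))


def refine_latent(latent: np.ndarray,
                  realistic_latent: np.ndarray,
                  latent_mask: np.ndarray) -> np.ndarray:
    """Blend two latents of shape (C, h, w) with a binary mask of shape (h, w).

    Masked cells (mask == 1) come from the realistic latent;
    unmasked cells keep the current latent unchanged.
    """
    m = latent_mask[None, :, :].astype(latent.dtype)  # broadcast over channels
    return (1.0 - m) * latent + m * realistic_latent


# Toy example: one flagged pixel in a 16 x 16 image, 4-channel 2 x 2 latents.
pixel_mask = np.zeros((16, 16), dtype=np.uint8)
pixel_mask[3, 10] = 1
latent_mask = downsample_mask(pixel_mask, factor=8)

current = np.zeros((4, 2, 2), dtype=np.float32)
realistic = np.ones((4, 2, 2), dtype=np.float32)
refined = refine_latent(current, realistic, latent_mask)
```

In this toy case only the latent cell covering the flagged pixel is overwritten, so the rest of the current latent, and the detail it carries, is left untouched.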

Paper Structure (28 sections, 7 equations, 7 figures, 5 tables, 1 algorithm)

Figures (7)

  • Figure 1: Visual comparison of artifact variability between (a) StableSR and (b) our SARGD method. In the lower panel, the left image is the decoded output and the right image is the artifact mask. The mask is a binary map in which regions containing artifacts are highlighted in red. Our SARGD method exhibits a superior capability in reducing artifacts.
  • Figure 2: Self-Adaptive Reality-Guided Diffusion (SARGD) for artifact-free super-resolution. Our proposed SARGD is a training-free approach that consists of two principal components: 1) a Reality-Guided Refinement (RGR) that identifies and corrects artifacts within the latent representation by using a realistic latent as a guide to maintain the inherent details of the original image during the diffusion process, and 2) a Self-Adaptive Guidance (SAG) mechanism that enhances the fidelity of the initial realistic latent guidance, derived from upscaled low-resolution images, thereby effectively addressing the issue of over-smoothing in the final outputs.
  • Figure 3: Overview of the Reality-Guided Refinement (RGR) workflow: 1) Decoding the current latent into an RGB image and using an artifact detector to create a binary mask identifying areas with artifacts; and 2) Utilizing realistic latent guidance to refine the masked regions, enhancing the image's fidelity and authenticity.
  • Figure 4: Visual comparison with diffusion-SR methods for $\times 2$, $\times 3$, and $\times 4$ super-resolution, including (a) Bicubic upsampling, (b) StableSR, and (c) our SARGD. The red solid-lined boxes represent the ground truth (GT), focusing on regions zoomed for detailed inspection. The green boxes illustrate how our SARGD method preserves significantly more detail and clarity compared to the alternatives.
  • Figure 5: Visual comparison of SARGD components. Our SARGD exhibits the best outcomes for $\times 3$ super-resolution.
  • ...and 2 more figures