ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration

Jiaqi Ma, Tianheng Cheng, Guoli Wang, Qian Zhang, Xinggang Wang, Lefei Zhang

TL;DR

The paper tackles the lack of universal, controllable image restoration across diverse degradations by proposing degradation-aware visual prompts. It integrates prompts into a ViT-based ProRes system, enabling control over restoration and efficient adaptation via prompt tuning. Experiments on denoising, deraining, low-light enhancement, and deblurring show ProRes is competitive with task-specific methods and outperforms other universal models, with clear demonstration of prompt-based transfer and controllability. This work provides a simple, effective baseline for universal image restoration and highlights the potential of prompt-based conditioning in low-level vision.

Abstract

Image restoration aims to reconstruct clean images from degraded ones, e.g., through denoising or deblurring. Existing works focus on designing task-specific methods, and universal methods remain underexplored. However, simply unifying multiple tasks into one universal architecture suffers from uncontrollable and undesired predictions. To address those issues, we explore prompt learning in universal architectures for image restoration tasks. In this paper, we present Degradation-aware Visual Prompts, which encode various types of image degradation, e.g., noise and blur, into unified visual prompts. These degradation-aware prompts provide control over image processing and allow weighted combinations for customized image restoration. We then leverage degradation-aware visual prompts to establish a controllable and universal model for image restoration, called ProRes, which is applicable to an extensive range of image restoration tasks. ProRes leverages the vanilla Vision Transformer (ViT) without any task-specific designs. Furthermore, the pre-trained ProRes can easily adapt to new tasks through efficient prompt tuning with only a few images. Without bells and whistles, ProRes achieves competitive performance compared to task-specific methods, and experiments demonstrate its ability for controllable restoration and adaptation to new tasks. The code and models will be released at https://github.com/leonmakise/ProRes.
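
The weighted combination of prompts mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each visual prompt is an array with the same shape as the input image, that two prompts are blended as $\alpha \textrm{D} + (1-\alpha) \textrm{E}$ (the form given in the Figure 1 caption), and that the blended prompt is added to the image before it enters the model (as described in the Figure 3 caption). The function names, the array sizes, the random stand-in prompts, and the restriction of $\alpha$ to $[0, 1]$ are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def combine_prompts(prompt_d, prompt_e, alpha):
    """Blend two visual prompts as alpha * D + (1 - alpha) * E."""
    # Assumption: alpha is restricted to [0, 1]; the source does not state a range.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    if prompt_d.shape != prompt_e.shape:
        raise ValueError("prompts must share one shape")
    return alpha * prompt_d + (1.0 - alpha) * prompt_e


def add_prompt(image, prompt):
    """Add a visual prompt to an input image of the same shape."""
    if image.shape != prompt.shape:
        raise ValueError("image and prompt must share one shape")
    return image + prompt


# Hypothetical stand-ins: real prompts would come from a trained model.
rng = np.random.default_rng(0)
derain_prompt = rng.normal(size=(8, 8, 3))   # plays the role of "D"
enhance_prompt = rng.normal(size=(8, 8, 3))  # plays the role of "E"
image = rng.uniform(size=(8, 8, 3))          # toy degraded input

# alpha = 0.3 weights low-light enhancement more heavily than deraining.
mixed_prompt = combine_prompts(derain_prompt, enhance_prompt, alpha=0.3)
prompted_image = add_prompt(image, mixed_prompt)
```

In this sketch, setting `alpha` to 1 or 0 recovers the pure deraining or pure enhancement prompt, and intermediate values interpolate between the two.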

Paper Structure

This paper contains 29 sections, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Visualization results produced by ProRes on images with mixed degradation types, i.e., low-light and rainy. ProRes adopts two visual prompts, one for low-light enhancement (E) and one for deraining (D), and combines them by a linear weighted sum, i.e., $\alpha \textrm{D} + (1-\alpha) \textrm{E}$, to control the restoration process.
  • Figure 2: Conceptual comparison with previous approaches. (a) Task-specific models design specialized architectures and strategies for different tasks, e.g., Model-A for low-light enhancement and Model-B for image denoising. (b) Multi-task models adopt a shared backbone for image feature extraction and leverage multiple task-specific heads for different tasks. (c) Universal models adopt mixed inputs without any task-specific indicator. (d) The proposed ProRes adopts input images with degradation-aware visual prompts for specific targets.
  • Figure 3: Overall pipeline of ProRes. (a) Training ProRes: we add the target visual prompt to the input image and flatten the prompted image into patches. We leverage a vision transformer, i.e., ViT-Large, as the image encoder and adopt a simple pixel decoder to generate the restored image. Then we adopt a pixel loss to optimize ProRes. (b) Prompt Tuning: we freeze the weights of ProRes and randomly initialize the learnable prompts for new tasks or new datasets.
  • Figure 4: Visualization results processed from images of different corruptions. Compared with the original inputs, the outputs are consistent with the given visual prompts.
  • Figure 5: Visualization results processed by different prompts. Compared with the original inputs, the outputs remain unchanged when irrelevant visual prompts are given.
  • ...and 3 more figures
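
The prompt-tuning step described in the Figure 3 caption (freeze the model weights, randomly initialize a learnable prompt, and optimize only the prompt) can be sketched on a toy problem. Everything below is a hypothetical stand-in: a fixed linear map replaces the frozen ViT-Large encoder and pixel decoder, mean squared error is assumed for the pixel loss (the caption does not name one), and the data are synthetic. It is meant only to show which quantities are updated, not to reproduce the authors' training code.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16  # size of a flattened toy "image" (hypothetical)

# Frozen stand-in for the pre-trained restorer: a fixed linear map.
frozen_weights = np.eye(dim) + 0.1 * rng.normal(size=(dim, dim))
frozen_snapshot = frozen_weights.copy()  # kept to confirm the model never changes


def restore(images, prompt):
    """Add the prompt to each input, then apply the frozen model."""
    return (images + prompt) @ frozen_weights.T


# "Only a few images": four synthetic degraded inputs with targets
# generated from a hidden prompt, so that a good prompt is known to exist.
hidden_prompt = rng.normal(size=dim)
degraded = rng.normal(size=(4, dim))
targets = restore(degraded, hidden_prompt)

# Randomly initialized learnable prompt; this is the only trained quantity.
prompt = rng.normal(scale=0.01, size=dim)
learning_rate = 1.0
losses = []
for step in range(500):
    residual = restore(degraded, prompt) - targets
    losses.append(float(np.mean(residual ** 2)))  # assumed pixel loss: MSE
    # Gradient of the mean squared error with respect to the prompt only.
    grad = 2.0 * (residual.mean(axis=0) @ frozen_weights) / dim
    prompt -= learning_rate * grad
```

Because the gradient is taken with respect to `prompt` alone, `frozen_weights` stays untouched throughout, which mirrors the frozen-model setting in panel (b) of Figure 3.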