InstructIR: High-Quality Image Restoration Following Human Instructions

Marcos V. Conde, Gregor Geigle, Radu Timofte

TL;DR

This work presents InstructIR, the first approach that uses human-written instructions to guide an image restoration model. InstructIR recovers high-quality images from their degraded counterparts and handles multiple degradation types.

Abstract

Image restoration is a fundamental problem that involves recovering a high-quality clean image from its degraded observation. All-In-One image restoration models can effectively restore images from various types and levels of degradation using degradation-specific information as prompts to guide the restoration model. In this work, we present the first approach that uses human-written instructions to guide the image restoration model. Given natural language prompts, our model can recover high-quality images from their degraded counterparts, considering multiple degradation types. Our method, InstructIR, achieves state-of-the-art results on several restoration tasks including image denoising, deraining, deblurring, dehazing, and (low-light) image enhancement. InstructIR improves by 1 dB over previous all-in-one restoration methods. Moreover, our dataset and results represent a novel benchmark for new research on text-guided image restoration and enhancement. Our code, datasets, and models are available at: https://github.com/mv-lab/InstructIR

Paper Structure (43 sections, 2 equations, 20 figures, 11 tables)

Figures (20)

  • Figure 1: Given an image and a prompt for how to improve that image, our all-in-one restoration model corrects the image considering the human instruction. InstructIR can tackle various types and levels of degradation, and it is able to generalize in some real-world scenarios (last three images, from left to right).
  • Figure 2: We train our blind image restoration models using common image datasets and prompts generated with GPT-4; note that this is (self-)supervised learning. At inference time, our model generalizes to human-written instructions and restores (or enhances) the images.
  • Figure 3: We show t-SNE plots of the text embeddings before/after training InstructIR. Each dot represents a human instruction.
  • Figure 4: Instruction Condition Block (ICB) using an approximation of task routing (Strezoski et al., 2019) for many-task learning (see the ICB equation in the paper). This mechanism allows the neural network to select and prioritize specific features depending on the instruction, similarly to a Mixture of Experts (MoE).
  • Figure 5: Adversarial and OOD samples for Instruction-based Restoration. InstructIR understands a wide range of instructions for a given task (first row). Given an adversarial or out-of-distribution instruction (second row), the model does not modify the image notably (i.e. it performs the identity), even though we did not enforce this during training.
  • ...and 15 more figures
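
The feature selection described in the Figure 4 caption can be illustrated with a small sketch. This is not the authors' implementation: the ICB equation is not reproduced in this section, so the sketch assumes the mask is a sigmoid of a linear projection of the instruction embedding, applied per channel and followed by a residual connection. All names (`instruction_mask`, `condition_features`, `icb`, `weights`, `block`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def instruction_mask(embedding, weights):
    """Map an instruction embedding to one soft gate in (0, 1) per channel.

    Assumption: a single linear projection (one weight row per channel)
    followed by a sigmoid; the paper may define the mask differently.
    """
    return [sigmoid(sum(w * e for w, e in zip(row, embedding))) for row in weights]


def condition_features(features, mask):
    """Scale every activation of a channel by that channel's gate."""
    return [[m * v for v in channel] for channel, m in zip(features, mask)]


def icb(features, embedding, weights, block=lambda f: f):
    """Hypothetical conditioning block: block(features * mask) + features.

    `block` stands in for whatever layers process the gated features;
    it defaults to the identity so the sketch stays self-contained.
    """
    gated = block(condition_features(features, instruction_mask(embedding, weights)))
    return [[g + v for g, v in zip(gc, fc)] for gc, fc in zip(gated, features)]


# Two channels with two activations each, and a 2-dimensional instruction embedding.
features = [[2.0, 4.0], [6.0, 8.0]]
embedding = [1.0, -2.0]
weights = [[0.5, 0.1], [-0.3, 0.2]]  # one row per channel (illustrative values)
output = icb(features, embedding, weights)
```

Because the gates are soft values rather than hard on/off switches, every channel can still contribute, which is one way to read the caption's "approximation" of task routing and its comparison to a Mixture of Experts.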