InstructRestore: Region-Customized Image Restoration with Human Instructions
Shuaizheng Liu, Jianqi Ma, Lingchen Sun, Xiangtao Kong, Lei Zhang
TL;DR
Existing diffusion-based image restoration methods process the whole image uniformly and cannot follow region-specific human instructions. InstructRestore introduces a region-aware restoration framework that combines ControlNet-like conditioning with a region mask decoder, trained on a dataset of 536,945 triplets, each pairing a high-quality image with a region mask and a region caption. Key contributions include a scalable data generation pipeline, a region-customized diffusion model, and demonstrations of localized enhancement and bokeh-preserving restoration under natural-language instructions. This work enables interactive, fine-grained image restoration with practical applications in photography and scene editing.
Abstract
Despite significant progress in diffusion prior-based image restoration, most existing methods apply uniform processing to the entire image and cannot perform region-customized restoration according to user instructions. In this work, we propose a new framework, InstructRestore, to perform region-adjustable image restoration following human instructions. To achieve this, we first develop a data generation engine to produce training triplets, each consisting of a high-quality image, a description of the target region, and the corresponding region mask. With this engine and careful data screening, we construct a comprehensive dataset of 536,945 triplets to support training and evaluation for this task. We then examine how low-quality image features should be integrated under the ControlNet architecture to adjust the degree of image detail enhancement. Based on this analysis, we develop a ControlNet-like model that identifies the target region and assigns different integration scales to the target and surrounding regions, enabling region-customized image restoration that aligns with user instructions. Experimental results demonstrate that InstructRestore enables effective human-instructed image restoration, such as restoring images with bokeh effects and performing user-instructed local enhancement. Our work advances the investigation of interactive image restoration and enhancement techniques. Data, code, and models will be available at https://github.com/shuaizhengliu/InstructRestore.git.
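The idea of assigning different integration scales to the target and surrounding regions can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the additive fusion of control features into base features, and the example scale values are all assumptions made for illustration. It only shows how a region mask can turn two scalar scales into a per-pixel scale map.

```python
import numpy as np

def region_scale_map(mask, target_scale, background_scale):
    """Build a per-pixel scale map from a region mask (hypothetical helper).

    mask: (H, W) array in [0, 1], where 1 marks the instructed target region.
    """
    mask = np.clip(np.asarray(mask, dtype=np.float32), 0.0, 1.0)
    # Inside the region use target_scale, outside use background_scale;
    # soft mask values blend linearly between the two.
    return mask * target_scale + (1.0 - mask) * background_scale

def fuse_features(base, control, mask, target_scale=1.0, background_scale=0.5):
    """Add control features to base features with a region-dependent scale.

    base, control: (C, H, W) feature maps. Additive fusion is an assumption
    chosen to mirror ControlNet-style residual injection.
    """
    scale = region_scale_map(mask, target_scale, background_scale)
    return base + scale[None, :, :] * control

# Toy example: a 2x2 feature map whose diagonal pixels form the target region.
base = np.zeros((2, 2, 2), dtype=np.float32)
control = np.ones((2, 2, 2), dtype=np.float32)
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
fused = fuse_features(base, control, mask, target_scale=1.2, background_scale=0.3)
```

In this toy setup the control signal is weighted more strongly inside the masked region than outside it, which is the kind of region-dependent behavior the abstract describes.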
