Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising
Huaqiu Li, Wang Zhang, Xiaowan Hu, Tao Jiang, Zikang Chen, Haoqian Wang
TL;DR
Prompt-SID tackles self-supervised single-image denoising by learning a structural representation prompt that preserves the high-frequency details otherwise lost during downsampling. It uses RG-Diff, a latent diffusion process, to generate a structural prompt from degraded inputs, and fuses this prompt into restoration through a Structural Attention Module within a Transformer denoiser. A scale replay mechanism aligns downsampled and original-scale restorations during training, improving cross-scale generalization. Across synthetic, real-world, and fluorescence imaging datasets, Prompt-SID reports state-of-the-art performance among self-supervised methods and competitive results against supervised baselines, with improved detail preservation and edge fidelity.
Abstract
Many studies on image denoising have concentrated on supervised models trained on paired datasets, which are expensive and time-consuming to collect. Current self-supervised and unsupervised approaches typically rely on blind-spot networks or sub-image pair sampling, which lose pixel information and destroy detailed structural information, significantly limiting the efficacy of such methods. In this paper, we introduce Prompt-SID, a prompt-learning-based single-image denoising framework that emphasizes preserving structural details. The framework is trained in a self-supervised manner using downsampled image pairs. It captures original-scale image information through structural encoding and integrates this encoding into the denoiser as a prompt. To achieve this, we propose a structural representation generation model based on the latent diffusion process and design a structural attention module within the transformer-based denoiser architecture to decode the prompt. Additionally, we introduce a scale replay training mechanism, which effectively mitigates the scale gap between images of different resolutions. We conduct comprehensive experiments on synthetic, real-world, and fluorescence imaging datasets, demonstrating the effectiveness of Prompt-SID. Our code will be released at https://github.com/huaqlili/Prompt-SID.
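To make the notion of training on downsampled image pairs concrete, the following is a minimal sketch of one common way to draw two half-resolution sub-images from a single noisy image: each non-overlapping 2x2 cell contributes one pixel to each sub-image. This is an illustrative assumption, not the sampler used by Prompt-SID; the function name `sample_subimage_pair` and the use of plain nested lists (to stay dependency-free) are hypothetical choices made for this example.

```python
import random


def sample_subimage_pair(noisy, rng):
    """Draw two half-resolution sub-images from one 2D image (list of rows).

    For every non-overlapping 2x2 cell, two distinct pixels are chosen at
    random: the first goes to sub-image A, the second to sub-image B.
    Odd trailing rows or columns are ignored.
    """
    h2, w2 = len(noisy) // 2, len(noisy[0]) // 2
    sub_a = [[None] * w2 for _ in range(h2)]
    sub_b = [[None] * w2 for _ in range(h2)]
    for i in range(h2):
        for j in range(w2):
            # The four pixels of the current 2x2 cell.
            cell = [
                noisy[2 * i][2 * j],
                noisy[2 * i][2 * j + 1],
                noisy[2 * i + 1][2 * j],
                noisy[2 * i + 1][2 * j + 1],
            ]
            # Two distinct positions, so A and B never share a pixel.
            ka, kb = rng.sample(range(4), 2)
            sub_a[i][j] = cell[ka]
            sub_b[i][j] = cell[kb]
    return sub_a, sub_b


# Usage: a 4x4 toy "image" yields two 2x2 sub-images.
rng = random.Random(0)
toy = [[float(4 * r + c) for c in range(4)] for r in range(4)]
sub_a, sub_b = sample_subimage_pair(toy, rng)
```

Because each sub-image keeps only one of every four pixels, fine structure present at the original scale is thinned out in the training pair. This is the information loss that the abstract attributes to sub-image pair sampling and that the structural prompt is intended to compensate for.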
