Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration
Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Lishen Qu, Jufeng Yang
TL;DR
Seeing the Unseen introduces FPro, a frequency prompt guided transformer for image restoration. It decouples input features into low- and high-frequency components with a gated dynamic decoupler (GDD), then injects frequency cues through a dual prompt block (DPB) made up of a low-frequency prompt modulator (LPM) and a high-frequency prompt modulator (HPM). The method uses Fourier-domain interactions and window-based attention to propagate frequency prompts and guide restoration, achieving state-of-the-art results on five tasks (deraining, deraindrop, demoiréing, deblurring, and dehazing) while maintaining competitive efficiency. Ablation studies confirm the distinct benefits of the GDD and of both DPB components, and comparisons with spatial-prompt baselines show substantial gains from the frequency-oriented design. The work positions frequency-domain prompting as a robust alternative to spatial prompts, with practical value for detail-preserving restoration under varied degradations, and the authors provide code and pre-trained models to facilitate adoption. At its core, the approach combines a residual formulation, $\hat{I}=I+R$, with FFT-based processing, while the LPM and HPM jointly encode global and local frequency cues to improve restoration fidelity across tasks.
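To make the low/high-frequency split and the residual formulation concrete, here is a minimal NumPy sketch. It is not FPro's learned decoupler: it uses a fixed radial mask in the Fourier domain, and the function name `fft_band_split`, the `cutoff` value, and the zero placeholder for $R$ are all assumptions made for illustration. It also assumes that $I$ denotes the degraded input and $R$ a predicted residual.

```python
import numpy as np

def fft_band_split(x, cutoff=0.25):
    """Split a 2D array into low- and high-frequency parts.

    Uses a fixed radial low-pass mask in the Fourier domain
    (an illustrative stand-in, not a learned filter).
    """
    h, w = x.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles per pixel
    mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff  # keep only low frequencies
    low = np.fft.ifft2(np.fft.fft2(x) * mask).real
    high = x - low  # the high band is whatever the low-pass removed
    return low, high

rng = np.random.default_rng(0)
degraded = rng.random((32, 32))        # stands in for the input I
low, high = fft_band_split(degraded)

residual = np.zeros_like(degraded)     # placeholder for a network-predicted R
restored = degraded + residual         # residual formulation: I_hat = I + R
```

Because the high band is defined as the input minus the low band, the two bands always sum back to the input, so no information is lost by the split itself.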
Abstract
Exploring useful features from images as prompts to guide deep image restoration models is an effective way to solve image restoration. Existing methods mine spatial relations within images as prompts, which neglects the characteristics of different frequencies and leaves subtle or undetectable artifacts in the restored image. In contrast, we develop a Frequency Prompting image restoration method, dubbed FPro, which effectively provides prompt components from a frequency perspective to guide the restoration model in addressing these differences. Specifically, we first decompose input features into separate frequency parts via dynamically learned filters, where we introduce a gating mechanism to suppress the less informative elements within the kernels. To propagate useful frequency information as a prompt, we then propose a dual prompt block, consisting of a low-frequency prompt modulator (LPM) and a high-frequency prompt modulator (HPM), to handle signals from the two bands separately. Each modulator contains a generation process that incorporates prompting components into the extracted frequency maps, and a modulation part that modifies the prompt feature under the guidance of the decoder features. Experimental results on commonly used benchmarks demonstrate the favorable performance of our pipeline against SOTA methods on five image restoration tasks: deraining, deraindrop, demoiréing, deblurring, and dehazing. The source code and pre-trained models will be available at https://github.com/joshyZhou/FPro.
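The first step of the abstract, decomposing features with a gated, dynamically learned filter, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the kernel and gate logits are random here, whereas the abstract describes them as learned, and the softmax normalization, reflect padding, and the name `gated_lowpass_split` are hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_lowpass_split(x, kernel_logits, gate_logits):
    """Decompose a 2D feature map with a gated low-pass kernel.

    kernel_logits and gate_logits are (k, k) arrays with odd k. In a trained
    model they would be predicted by learned layers (assumption); here the
    caller supplies them directly.
    """
    k = kernel_logits.shape[0]
    gate = sigmoid(gate_logits)  # values in (0, 1); small values suppress kernel entries
    weights = np.exp(kernel_logits - kernel_logits.max()) * gate
    kernel = weights / weights.sum()  # non-negative, sums to 1, so it acts as a low-pass filter
    pad = k // 2
    xp = np.pad(x, pad, mode="reflect")
    h, w = x.shape
    low = np.zeros_like(x, dtype=float)
    for i in range(k):  # accumulate the weighted, shifted copies of the input
        for j in range(k):
            low += kernel[i, j] * xp[i:i + h, j:j + w]
    high = x - low  # high-frequency part is the remainder
    return low, high, kernel

rng = np.random.default_rng(0)
feat = rng.random((16, 16))
low, high, kernel = gated_lowpass_split(
    feat, rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
)
```

Normalizing the gated weights to sum to one means a constant input passes entirely into the low band, which is the behavior one would expect from a low-pass filter.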
