Enhancing Image Restoration through Learning Context-Rich and Detail-Accurate Features
Hu Gao, Depeng Dang
TL;DR
This work tackles image restoration as an ill-posed problem requiring a balance between spatial details and contextual information, particularly under frequency-domain variations. It proposes LCDNet, a multi-scale encoder–decoder that jointly learns spatial and frequency-domain cues using a Hybrid Scale Frequency Selection Block (HSFSBlock) and a Skip Connection Attention Mechanism (SCAM), with a coarse-to-fine training regime. Key contributions include the architecture itself, the Multi Scale Spatial Feature Block (MSSFBlock), the Multi-branch Selective Frequency Module (MSFM), and the SCAM, all trained with a joint spatial-frequency loss $L = L_s + \lambda L_f$ where $L_s$ aggregates per-scale spatial errors and $L_f$ enforces frequency-domain fidelity via $\mathcal{F}$. Empirical results across motion deblurring, defocus deblurring, and deraining demonstrate that LCDNet achieves state-of-the-art or competitive performance while reducing computational cost, and exhibits strong generalization to unseen datasets, signaling practical impact for real-world restoration tasks. The method advances frequency-aware restoration by enabling selective frequency reconstruction and noise-robust skip connections, essential for reliable high-quality outputs.
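As a rough illustration of the joint objective $L = L_s + \lambda L_f$, the sketch below sums a spatial error and a frequency-domain error over several output scales. It is not the authors' implementation. The L1 distance, the default `lam=0.1`, and the helper names `dft2`, `mean_abs_diff`, and `joint_loss` are all assumptions made for this example. A naive DFT stands in for $\mathcal{F}$ so the code needs no external libraries; a real pipeline would use an FFT routine on tensors.

```python
import cmath


def dft2(img):
    """Naive 2-D discrete Fourier transform of a list-of-lists image.

    Stands in for the transform F; O(N^4), so only suitable for tiny inputs.
    """
    h, w = len(img), len(img[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc
    return out


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized 2-D grids."""
    h, w = len(a), len(a[0])
    total = sum(abs(a[y][x] - b[y][x]) for y in range(h) for x in range(w))
    return total / (h * w)


def joint_loss(preds, targets, lam=0.1):
    """Return (L, L_s, L_f) for lists of per-scale predictions and targets.

    L_s sums the spatial L1 error over scales (assumed norm).
    L_f sums the L1 error between the spectra over scales (assumed norm).
    lam is a hypothetical weighting, not a value taken from the paper.
    """
    l_s = sum(mean_abs_diff(p, t) for p, t in zip(preds, targets))
    l_f = sum(mean_abs_diff(dft2(p), dft2(t)) for p, t in zip(preds, targets))
    return l_s + lam * l_f, l_s, l_f


# Two scales: a 2x2 output and a 1x1 (downsampled) output.
preds = [[[1.0, 1.0], [1.0, 1.0]], [[2.0]]]
targets = [[[0.0, 0.0], [0.0, 0.0]], [[0.0]]]
total, l_s, l_f = joint_loss(preds, targets)
```

Passing one prediction and target pair per scale mirrors the coarse-to-fine setup described above, where each scale contributes its own term to both $L_s$ and $L_f$.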
Abstract
Image restoration involves recovering high-quality images from their corrupted versions, which requires a careful balance between spatial details and contextual information. While some methods address this balance, they predominantly emphasize spatial aspects and neglect how degradations vary across frequencies. In this paper, we present a multi-scale design that balances these competing objectives, integrating spatial and frequency domain knowledge to selectively recover the most informative content. Specifically, we develop a hybrid scale frequency selection block (HSFSBlock), which not only captures multi-scale information from the spatial domain, but also selects the most informative components for image restoration in the frequency domain. Furthermore, to mitigate the inherent noise introduced by skip connections that employ only addition or concatenation, we introduce a skip connection attention mechanism (SCAM) that selectively determines which information should propagate through skip connections. We name the resulting tightly interlinked architecture LCDNet. Extensive experiments across diverse image restoration tasks show that our model attains performance that is superior or comparable to that of state-of-the-art algorithms.
