Exploring Richer and More Accurate Information via Frequency Selection for Image Restoration

Hu Gao, Depeng Dang

TL;DR

This work tackles image restoration by combining spatial- and frequency-domain information to better recover degraded images. It introduces MSFSNet, a four-scale encoder-decoder built around two plug-in modules. The Dynamic Filter Selection module (DFS) generates low- and high-frequency maps with learnable filters and uses a Frequency Cross-Attention Mechanism (FCAM) to select the most informative frequencies. The Skip Feature Fusion block (SFF) selectively propagates useful information through the skip connections. The network is trained with a joint spatial and Fourier-domain loss, L = L_s + \lambda L_f, where L_s captures pixel-wise differences, L_f enforces fidelity in the frequency domain via the Fourier transform, and \lambda controls the balance between the two. Across image motion deblurring, defocus deblurring, deraining, and denoising, MSFSNet delivers state-of-the-art or competitive results at lower cost, including a reduction in MACs of up to 69.7% on deraining (Figure 1) and improved PSNR over strong baselines on multiple datasets. Because DFS and SFF are plug-in modules, they can be integrated into existing restoration networks to improve multi-scale feature quality and robustness.
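The joint loss above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' training code: the L1 norm for both terms, the unnormalized FFT, the default `lam=0.1`, and the function name `joint_loss` are all assumptions made for the example.

```python
import numpy as np

def joint_loss(pred, target, lam=0.1):
    """Sketch of L = L_s + lam * L_f for arrays shaped (H, W) or (C, H, W).

    Assumptions (not confirmed by the summary): both terms use a mean
    absolute difference, and lam=0.1 is only a placeholder value.
    """
    # Spatial term L_s: mean absolute pixel-wise difference.
    l_s = np.mean(np.abs(pred - target))
    # Frequency term L_f: mean magnitude of the difference between the
    # 2-D Fourier transforms, taken over the last two (spatial) axes.
    diff_f = np.fft.fft2(pred, axes=(-2, -1)) - np.fft.fft2(target, axes=(-2, -1))
    l_f = np.mean(np.abs(diff_f))
    return l_s + lam * l_f, l_s, l_f

# Usage: compare a restored image against its ground truth.
rng = np.random.default_rng(0)
clean = rng.random((3, 16, 16))
restored = clean + 0.01 * rng.standard_normal(clean.shape)
total, l_s, l_f = joint_loss(restored, clean)
```

Because the frequency term is computed on the full spectrum, a small `lam` keeps it from dominating the pixel-wise term; the right value depends on the FFT normalization chosen.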

Abstract

Image restoration aims to recover high-quality images from their corrupted counterparts. Many existing methods focus primarily on the spatial domain, neglecting frequency variations and ignoring the impact of implicit noise in skip connections. In this paper, we introduce a multi-scale frequency selection network (MSFSNet) that integrates spatial and frequency domain knowledge, selectively recovering richer and more accurate information. Specifically, we first capture spatial features and feed them into dynamic filter selection modules (DFS) at different scales to integrate frequency knowledge. DFS uses learnable filters to generate high- and low-frequency information and employs a frequency cross-attention mechanism (FCAM) to determine the most informative components to recover. To learn a multi-scale and accurate set of hybrid features, we develop a skip feature fusion block (SFF) that leverages contextual features to discriminatively determine which information should be propagated in skip connections. Notably, DFS and SFF are generic plug-in modules that can be directly employed in existing networks without any adjustments, leading to performance improvements. Extensive experiments across various image restoration tasks demonstrate that MSFSNet achieves performance that is either superior or comparable to state-of-the-art algorithms.
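One common way to obtain low- and high-frequency maps from a learnable filter can be sketched as follows. This is a hedged illustration only: the abstract does not specify how DFS builds its filters, so the softmax-normalized kernel, the residual definition of the high-frequency map, and the name `split_frequencies` are assumptions, and the kernel values here are fixed rather than learned.

```python
import numpy as np

def split_frequencies(x, kernel_logits):
    """Split a 2-D feature map into (low, high) with one k x k filter.

    Assumption: kernel_logits stands in for a learnable parameter with
    odd side length k; this is not the authors' DFS implementation.
    """
    k = kernel_logits.shape[0]
    # Softmax makes the weights non-negative and sum to 1, so the filter
    # acts as a weighted local average, i.e. a low-pass filter.
    w = np.exp(kernel_logits - kernel_logits.max())
    w /= w.sum()
    pad = k // 2
    xp = np.pad(x, pad, mode="reflect")
    low = np.zeros_like(x, dtype=float)
    # Accumulate the weighted, shifted copies of the padded input.
    for i in range(k):
        for j in range(k):
            low += w[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    # The high-frequency map is what the low-pass filter removed.
    high = x - low
    return low, high

# Usage: zero logits give a uniform 3 x 3 box filter.
feat = np.random.default_rng(0).random((6, 6))
low, high = split_frequencies(feat, np.zeros((3, 3)))
```

Defining the high-frequency map as the residual guarantees that the two maps sum back to the input, so a later selection step can reweight them without losing information.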

Paper Structure (22 sections, 11 equations, 11 figures, 11 tables)

Figures (11)

  • Figure 1: Computational cost vs. PSNR of models on image deblurring (left) and image deraining (right). MSFSNet achieves SOTA performance with up to 69.7% cost reduction on image deraining. MACs are computed on a patch size of 256 × 256.
  • Figure 2: (a) Overall architecture of MSFSNet. (b) Shallow feature extraction block (SFE) for capturing shallow features in low-resolution input images. (c) Multi-scale frequency selection block (MSFS), which integrates spatial and frequency domain knowledge and selects the most informative components for recovery. (d) Skip feature fusion block (SFF) for discriminative information propagation in skip connections.
  • Figure 3: The structure of NAFBlock [chen2022simple].
  • Figure 4: (a) The structure of the dynamic filter selection module (DFS). (b) Frequency cross-attention mechanism (FCAM), which determines which low-frequency and high-frequency information should be retained.
  • Figure 5: Image motion deblurring comparisons on the GoPro dataset [Gopro]. Compared with state-of-the-art methods, MSFSNet restores sharper and more perceptually faithful images.
  • ...and 6 more figures