WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing

Kai Han, Jin Wang, Yunhui Shi, Hanqin Cai, Nam Ling, Baocai Yin

TL;DR

In image compressed sensing, naive sampling and uniform reconstruction often fail to preserve fine and structural details at low sampling rates. WTDUN is a wavelet-domain deep unfolding framework that jointly performs adaptive, subband-aware sampling and tree-structured reconstruction across multiple wavelet scales, guided by a quad-tree prior and with deblocking at each stage. The method combines a wavelet-domain adaptive sampling (WAS) strategy with a wavelet-tree-structure prior (WTP) and a cascade of reconstruction stages that exploit inter-scale dependencies, augmented by cross-domain attention and texture-aware losses. Trained on BSD500-derived data and evaluated on multiple test datasets, it shows consistent PSNR/SSIM improvements over state-of-the-art CS methods, with qualitative gains in texture and edge fidelity and a favorable accuracy–speed trade-off, which suggests it may suit real-time, high-quality image CS in resource-constrained settings.

Abstract

Deep unfolding networks have gained increasing attention in the field of compressed sensing (CS) owing to their theoretical interpretability and superior reconstruction performance. However, most existing deep unfolding methods face two issues: 1) they learn directly from single-channel images, leading to a simple feature representation that does not fully capture complex features; and 2) they treat all image components uniformly, ignoring the characteristics of different components. To address these issues, we propose a novel wavelet-domain deep unfolding framework named WTDUN, which operates directly on the multi-scale wavelet subbands. Our method exploits the intrinsic sparsity and multi-scale structure of wavelet coefficients to achieve tree-structured sampling and reconstruction, effectively capturing and highlighting the most important features within images. Specifically, the tree-structured reconstruction is designed to capture the inter-dependencies among the multi-scale subbands, enabling the identification of both fine and coarse features, which can lead to a marked improvement in reconstruction quality. Furthermore, we propose a wavelet-domain adaptive sampling method that improves the sampling capability by assigning measurements to each wavelet subband based on its importance. Unlike pure deep learning methods that treat all components uniformly, our method focuses on important subbands, taking their energy and sparsity into account. This targeted strategy lets us capture key information more efficiently while discarding less important information, resulting in a more effective and detailed reconstruction. Extensive experimental results on various datasets validate the superior performance of our proposed method.
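To make the idea of assigning measurements to subbands by importance concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-level orthonormal Haar transform (the paper works on multi-scale subbands), uses subband energy alone as the importance score (the abstract also mentions sparsity), and distributes a measurement budget with a greedy highest-averages rule capped at each subband's size. The function names `haar_dwt2`, `subband_energy`, and `allocate_measurements` are hypothetical, and the LH/HL labels follow one of several naming conventions in use.

```python
def haar_dwt2(img):
    """One-level orthonormal 2-D Haar transform of a list-of-lists image.

    Height and width must be even. Returns a dict of four subbands.
    """
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for i in range(0, len(img), 2):
        rows = {name: [] for name in bands}
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows["LL"].append((a + b + c + d) / 2.0)  # scaled local average
            rows["LH"].append((a + b - c - d) / 2.0)  # top row minus bottom row
            rows["HL"].append((a - b + c - d) / 2.0)  # left column minus right column
            rows["HH"].append((a - b - c + d) / 2.0)  # diagonal difference
        for name in bands:
            bands[name].append(rows[name])
    return bands


def subband_energy(band):
    """Sum of squared coefficients, used here as the importance score."""
    return sum(v * v for row in band for v in row)


def allocate_measurements(energies, sizes, total):
    """Greedy highest-averages split of `total` measurements across subbands.

    Each measurement goes to the subband with the largest
    energy / (already_allocated + 1), and no subband receives more
    measurements than it has coefficients.
    """
    alloc = {name: 0 for name in energies}
    for _ in range(min(total, sum(sizes.values()))):
        candidates = [n for n in alloc if alloc[n] < sizes[n]]
        best = max(candidates, key=lambda n: energies[n] / (alloc[n] + 1))
        alloc[best] += 1
    return alloc


# Toy 4x4 image: a flat bright block plus one isolated pixel.
img = [[10, 10, 0, 0],
       [10, 10, 0, 0],
       [0, 0, 4, 0],
       [0, 0, 0, 0]]
bands = haar_dwt2(img)
energies = {name: subband_energy(b) for name, b in bands.items()}
sizes = {name: len(b) * len(b[0]) for name, b in bands.items()}
print(allocate_measurements(energies, sizes, total=6))
# → {'LL': 4, 'LH': 1, 'HL': 1, 'HH': 0}
```

In this toy case the low-frequency subband carries almost all of the energy, so it is sampled fully before any budget reaches the detail subbands. A learned or sparsity-aware importance score would replace `subband_energy` in a fuller treatment.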

Paper Structure

This paper contains 33 sections, 20 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Illustration of WTDUN, which consists of a sampling module, an initial module, and $K$ reconstruction modules.
  • Figure 2: The detailed design of one single phase in WTDUN. The GDM (gradient descent module) represents Eq. \ref{gdm_r}. STF-1 and STF-2 denote Eq. \ref{relu_z} and Eq. \ref{relu_theta}, respectively.
  • Figure 3: Detailed design of each module. (1) indicates the sampling process. (2) is the initial reconstruction process. (3) is the deblock module, which is composed of six convolution layers, with a ReLU activation layer between the adjacent convolution layers. (4) is the denoise module, which is composed of four convolution layers, with a ReLU activation layer between the adjacent convolution layers. (5) is the cross-subband fusion module (CAC).
  • Figure 4: Visual quality comparisons between our proposed method and recent state-of-the-art CS methods on Set5 at 10% CS ratio. The best and second-best results are highlighted in bold and italics, respectively.
  • Figure 5: Visual quality comparisons between our WTDUN and recent state-of-the-art CS methods on Set11 at 10% CS ratio. The best and second-best results are highlighted in bold and italics, respectively.
  • ...and 3 more figures