Uformer-ICS: A U-Shaped Transformer for Image Compressive Sensing Service
Kuiyuan Zhang, Zhongyun Hua, Yuanman Li, Yushu Zhang, Yicong Zhou
TL;DR
This work addresses real-time image compressive sensing (CS) by integrating CS priors into a U-shaped transformer. It introduces an adaptive sampling mechanism that estimates block sparsity from initial measurements and allocates sampling resources per block, and a multi-channel projection (MCP) module that injects CS projection knowledge into the transformer blocks, forming projection-based transformer blocks. The reconstruction network combines a four-level Uformer with MCP and residual convolutions to capture both local and long-range dependencies. It achieves state-of-the-art PSNR/SSIM on five datasets and is scalable, with a single model handling arbitrary sampling rates. The results show significant improvements over existing deep-learning-based CS methods and highlight the practical potential for bandwidth- and storage-efficient image sensing in service-oriented applications.
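The per-block allocation idea can be illustrated with a minimal sketch. It assumes the sparsity scores are already available as non-negative numbers (in the paper they are estimated from initial measurements) and splits a fixed measurement budget in proportion to those scores. The function name `allocate_measurements`, the `min_per_block` floor, and the proportional rule with largest-remainder rounding are illustrative assumptions, not the allocation rule used by Uformer-ICS.

```python
def allocate_measurements(scores, total_budget, min_per_block=1):
    """Split total_budget measurements across blocks in proportion to scores.

    Hypothetical helper: proportional allocation is an assumption made for
    illustration, not the rule described in the paper.
    """
    n = len(scores)
    if n == 0:
        return []
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    if total_budget < n * min_per_block:
        raise ValueError("budget is too small for the per-block minimum")

    # Every block first receives the guaranteed minimum.
    spare = total_budget - n * min_per_block
    total_score = sum(scores)
    if total_score == 0:
        shares = [spare / n] * n  # no information: split evenly
    else:
        shares = [spare * s / total_score for s in scores]

    # Round down, then hand the leftover measurements to the blocks
    # with the largest fractional parts (largest-remainder rounding).
    counts = [min_per_block + int(share) for share in shares]
    leftover = total_budget - sum(counts)
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Four blocks; higher score = less sparse = more measurements.
    print(allocate_measurements([1, 3, 0, 4], total_budget=20))
```

Blocks with higher scores receive more measurements, while the total stays equal to the budget, so the overall sampling rate is unchanged.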
Abstract
Many service computing applications require real-time data collection from multiple devices, necessitating efficient sampling techniques to reduce bandwidth and storage pressure. Compressive sensing (CS) has found wide-ranging applications in image acquisition and reconstruction. Recently, numerous deep-learning methods have been introduced for CS tasks. However, accurately reconstructing images from measurements remains a significant challenge, especially at low sampling rates. In this paper, we propose Uformer-ICS, a novel U-shaped transformer for image CS tasks that introduces the inner characteristics of CS into the transformer architecture. To exploit the uneven sparsity distribution of image blocks, we design an adaptive sampling architecture that allocates measurement resources based on the estimated block sparsity, allowing the compressed results to retain maximum information from the original image. Additionally, we introduce a multi-channel projection (MCP) module inspired by traditional CS optimization methods. By integrating the MCP module into the transformer blocks, we construct projection-based transformer blocks, and then form a symmetrical reconstruction model from these blocks and residual convolutional blocks. As a result, our reconstruction model can simultaneously utilize the local features and long-range dependencies of an image, together with the prior projection knowledge of CS theory. Experimental results demonstrate that Uformer-ICS achieves significantly better reconstruction performance than state-of-the-art deep-learning-based CS methods.
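The abstract does not spell out the MCP computation. As background, the sketch below shows the classical single-channel projection (gradient) step used in traditional CS optimization, x ← x + ρ Φᵀ(y − Φx), which is the kind of update such modules are commonly modeled on. The names `projection_step`, `phi`, and `step`, and the plain-list matrix helpers, are assumptions for illustration; this is not the learned multi-channel module from the paper.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def projection_step(x, y, phi, step=1.0):
    """One classical projection step: x + step * phi^T (y - phi x).

    x    : current estimate of the image block (flattened)
    y    : measurements of the block, y = phi @ x_true
    phi  : sampling matrix, one row per measurement
    step : step size (rho); an assumed free parameter here
    """
    residual = [yi - pi for yi, pi in zip(y, matvec(phi, x))]
    correction = matvec(transpose(phi), residual)
    return [xi + step * ci for xi, ci in zip(x, correction)]


if __name__ == "__main__":
    # Toy example: two measurements of a three-pixel block.
    phi = [[1, 0, 0],
           [0, 1, 0]]
    y = matvec(phi, [2, -1, 5])
    x1 = projection_step([0, 0, 0], y, phi)
    print(x1, matvec(phi, x1))
```

In this toy case the rows of `phi` are orthonormal, so a single step with `step=1.0` makes the estimate consistent with the measurements. The unmeasured third component stays at zero, which is the gap that a learned reconstruction network is meant to fill.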
