Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu, Ruiying Lu, Ying Fu, Xin Yuan
TL;DR
The paper addresses the reconstruction of high-dimensional hyperspectral data from a single snapshot in coded aperture snapshot spectral imaging (CASSI), an inherently ill-posed problem. It introduces a latent diffusion prior to guide a physics-based deep unfolding network. Training proceeds in two phases: first, a prior is learned from clean hyperspectral images (HSIs); second, a latent diffusion model (LDM) is trained to generate that prior conditioned on the degraded input. A Trident Transformer (TT) then fuses the degradation-free prior with spatial and spectral information. The approach builds on a GC-GAP unfolding framework and a lightweight latent encoder, which keeps inference efficient while achieving higher PSNR/SSIM at lower computational cost than state-of-the-art methods. Experiments on synthetic and real SD-CASSI data show improved reconstruction quality and practical efficiency, and ablations support the contribution of both the LDM prior and the TT design.
Abstract
Snapshot compressive spectral imaging reconstruction aims to recover three-dimensional spatial-spectral images from a single-shot two-dimensional compressed measurement. Existing state-of-the-art methods are mostly based on deep unfolding structures but have intrinsic performance bottlenecks: (i) the ill-posed problem of dealing with a heavily degraded measurement, and (ii) regression loss-based reconstruction models being prone to recovering images with few details. In this paper, we introduce a generative model, namely the latent diffusion model (LDM), to generate a degradation-free prior that enhances the regression-based deep unfolding method. Furthermore, to overcome the large computational cost of LDM, we propose a lightweight model to generate knowledge priors for the deep unfolding denoiser, and integrate these priors to guide the reconstruction process and compensate for high-quality spectral signal details. Numerical and visual comparisons on synthetic and real-world datasets illustrate the superiority of our proposed method in both reconstruction quality and computational efficiency. Code will be released.
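
To make the measurement model concrete, the following is a minimal NumPy sketch of how a 3D spectral cube can be compressed into one 2D snapshot in an SD-CASSI-style system, together with a plain generalized alternating projection (GAP) data-consistency step of the kind that unfolding networks interleave with a learned denoiser. It is an illustration under stated assumptions, not the paper's implementation: the function names (`forward`, `adjoint`, `gap_projection`), the 2-pixel dispersion step, and the random binary mask are all hypothetical choices, and the paper's GC-GAP variant, LDM prior, and Trident Transformer are not reproduced here.

```python
import numpy as np

def forward(x, mask, step=2):
    """Assumed SD-CASSI forward model: mask each band, shift band k
    by k*step pixels along the width, and sum onto a 2D detector."""
    h, w, b = x.shape
    y = np.zeros((h, w + (b - 1) * step), dtype=float)
    for k in range(b):
        y[:, k * step : k * step + w] += mask * x[:, :, k]
    return y

def adjoint(y, mask, bands, step=2):
    """Adjoint of `forward`: crop each band's window and re-apply the mask."""
    h, w = mask.shape
    x = np.zeros((h, w, bands), dtype=float)
    for k in range(bands):
        x[:, :, k] = mask * y[:, k * step : k * step + w]
    return x

def gap_projection(v, y, mask, step=2):
    """One plain GAP step: x = v + A^T (A A^T)^{-1} (y - A v).
    For this forward model A A^T is diagonal, so the inverse is an
    element-wise division by the summed, shifted squared mask."""
    bands = v.shape[2]
    # Diagonal of A A^T: sum over bands of the shifted mask squared.
    phi_sum = forward(np.repeat(mask[:, :, None], bands, axis=2), mask, step)
    phi_sum[phi_sum == 0] = 1.0  # pixels no band reaches; residual is 0 there
    residual = (y - forward(v, mask, step)) / phi_sum
    return v + adjoint(residual, mask, bands, step)

# Toy example with assumed sizes: an 8x8 cube with 4 spectral bands.
rng = np.random.default_rng(0)
x_true = rng.random((8, 8, 4))
mask = (rng.random((8, 8)) > 0.5).astype(float)

y = forward(x_true, mask)                              # 2D snapshot
x_est = gap_projection(np.zeros_like(x_true), y, mask)  # one projection step
print(y.shape)  # (8, 14)
print(np.allclose(forward(x_est, mask), y))
```

After the projection, the estimate reproduces the measurement exactly, yet it is still far from the true cube: many different cubes map to the same snapshot. This is the ill-posedness the abstract refers to, and it is why a strong prior is needed between projection steps.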
