Self-Supervised One-Step Diffusion Refinement for Snapshot Compressive Imaging
Shaoguang Huang, Yunzhen Wang, Haijin Zeng, Hongyu Chen, Hongyan Zhang
TL;DR
The paper tackles the ill-posed task of reconstructing multispectral images from a single snapshot in snapshot compressive imaging. It introduces a self-supervised One-Step Diffusion Refinement (OSD) framework that couples an existing SCI predictor with a single-step diffusion residual (DiFA), enabling fast, high-fidelity refinements without iterative denoising. A spectral compression distillation strategy transfers RGB diffusion priors to MSI space, while an equivariant imaging consistency loss leverages 2-D measurements alone for robust training and generalization. Across simulations and real CASSI data, the method achieves state-of-the-art PSNR/SSIM and substantial speedups, demonstrating practical viability for real-world SCI reconstruction.
Abstract
Snapshot compressive imaging (SCI) captures multispectral images (MSIs) using a single coded two-dimensional (2-D) measurement, but reconstructing high-fidelity MSIs from these compressed inputs remains a fundamentally ill-posed challenge. While diffusion-based reconstruction methods have recently raised the bar for quality, they face critical limitations: a lack of large-scale MSI training data, adverse domain shifts from RGB-pretrained models, and inference inefficiencies due to multi-step sampling. These drawbacks restrict their practicality in real-world applications. In contrast to existing methods, which either follow costly iterative refinement or adapt subspace-based embeddings for diffusion models (e.g., DiffSCI, PSR-SCI), we introduce a fundamentally different paradigm: a self-supervised One-Step Diffusion (OSD) framework specifically designed for SCI. The key novelty lies in using a single-step diffusion refiner to correct an initial reconstruction, eliminating iterative denoising entirely while preserving generative quality. Moreover, we adopt a self-supervised equivariant learning strategy to train both the predictor and refiner directly from raw 2-D measurements, enabling generalization to unseen domains without the need for ground-truth MSIs. To further address the challenge of limited MSI data, we design a band-selection-driven distillation strategy that transfers core generative priors from large-scale RGB datasets, effectively bridging the domain gap. Extensive experiments confirm that our approach sets a new benchmark, yielding PSNR gains of 3.44 dB, 1.61 dB, and 0.28 dB on the Harvard, NTIRE, and ICVL datasets, respectively, while reducing reconstruction time by 97.5%. These gains in accuracy, efficiency, and adaptability make the method practical for real-world SCI deployment.
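The training idea described above (a predictor plus a single residual refinement, supervised only by 2-D measurements through an equivariance constraint) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names are hypothetical, the forward model omits the dispersion shift of a real CASSI system, a normalized back-projection stands in for the SCI predictor, and `refiner` is a placeholder callable rather than a diffusion network. The two loss terms follow the generic equivariant imaging recipe (measurement consistency plus consistency under a transformation such as rotation), which is assumed here to resemble the strategy the abstract refers to.

```python
import numpy as np

def forward_sci(x, masks):
    """Simplified SCI forward model (assumption: no dispersion shift).
    x, masks: (H, W, B) arrays -> coded 2-D measurement of shape (H, W)."""
    return np.sum(x * masks, axis=-1)

def backproject(y, masks):
    """Stand-in predictor: normalized adjoint of the forward model."""
    norm = np.sum(masks ** 2, axis=-1, keepdims=True) + 1e-8
    return masks * y[..., None] / norm

def reconstruct(y, masks, refiner=None):
    """Initial estimate plus one residual correction (no iterative sampling)."""
    x0 = backproject(y, masks)
    if refiner is None:
        return x0
    return x0 + refiner(x0)

def ei_loss(y, masks, transform, refiner=None):
    """Return (measurement-consistency, equivariance) losses computed from y alone."""
    x1 = reconstruct(y, masks, refiner)
    # Re-measuring the estimate should reproduce the observed measurement.
    mc = np.mean((forward_sci(x1, masks) - y) ** 2)
    # A transformed estimate, re-measured and re-reconstructed, should match itself.
    x2 = transform(x1)
    x3 = reconstruct(forward_sci(x2, masks), masks, refiner)
    eq = np.mean((x3 - x2) ** 2)
    return mc, eq
```

In a learned setting the two terms would be summed (with a weight that is a tuning choice) and minimized over the parameters of the predictor and refiner; no ground-truth MSI appears anywhere in the loss. The rotation used as `transform` requires square spatial dimensions so that the masks still align after rotating.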
