Physics-Informed Untrained Learning for RGB-Guided Superresolution Single-Pixel Hyperspectral Imaging

Hao Zhang, Bilige Xu, Lichen Wei, Xu Ma, Wenyi Ren

Abstract

Single-pixel imaging (SPI) offers a cost-effective route to hyperspectral acquisition but struggles to recover high-fidelity spatial and spectral details under extremely low sampling rates, a severely ill-posed inverse problem. While deep learning has shown potential, existing data-driven methods demand large-scale pretraining datasets that are often impractical in hyperspectral imaging. To overcome this limitation, we propose an end-to-end physics-informed framework that leverages untrained neural networks and RGB guidance for joint hyperspectral reconstruction and super-resolution without any external training data. The framework comprises three physically grounded stages: (1) a Regularized Least-Squares method with RGB-derived Grayscale Priors (LS-RGP) that initializes the solution by exploiting cross-modal structural correlations; (2) an Untrained Hyperspectral Recovery Network (UHRNet) that refines the reconstruction through measurement consistency and hybrid regularization; and (3) a Transformer-based Untrained Super-Resolution Network (USRNet) that upsamples the spatial resolution via cross-modal attention, transferring high-frequency details from the RGB guide. Extensive experiments on benchmark datasets demonstrate that our approach significantly surpasses state-of-the-art algorithms in both reconstruction accuracy and spectral fidelity. Moreover, a proof-of-concept experiment using a physical single-pixel imaging system validates the framework's practical applicability, successfully reconstructing a 144-band hyperspectral data cube at a mere 6.25% sampling rate. The proposed method thus provides a robust, data-efficient solution for computational hyperspectral imaging.
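To make the first stage concrete: in SPI, the measurements are inner products between the scene and a set of illumination patterns, y = Φx, and the LS-RGP initialization can be read as a regularized least-squares problem that pulls the solution toward an RGB-derived grayscale prior g. The sketch below illustrates that idea only; the function name `ls_rgp_init`, the regularization weight `lam`, and the toy problem sizes are our illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def ls_rgp_init(phi, y, g, lam=1.0):
    """Least-squares initialization with a grayscale prior (illustrative).

    Solves  min_x ||phi @ x - y||^2 + lam * ||x - g||^2,
    whose closed form is  (phi^T phi + lam*I)^{-1} (phi^T y + lam*g).
    """
    n = phi.shape[1]
    a = phi.T @ phi + lam * np.eye(n)   # regularized normal matrix
    b = phi.T @ y + lam * g             # data term plus prior term
    return np.linalg.solve(a, b)

# Toy setup: a 16x16 band image sampled at 6.25% (16 measurements).
rng = np.random.default_rng(0)
n = 16 * 16
m = int(0.0625 * n)                     # 6.25% sampling rate
phi = rng.standard_normal((m, n))       # random measurement patterns
x_true = rng.random(n)                  # flattened ground-truth band
g = x_true + 0.05 * rng.standard_normal(n)  # stand-in grayscale prior
y = phi @ x_true                        # single-pixel measurements

x0 = ls_rgp_init(phi, y, g, lam=1.0)
```

Because x0 minimizes the combined objective, its measurement residual can never exceed that of the prior alone, which is what makes it a safe warm start for the subsequent untrained-network refinement stages.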

Paper Structure

This paper contains 32 sections, 13 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: The overall architecture of the proposed RGB-guided hyperspectral reconstruction framework. (a) End-to-end pipeline integrating SPI physics, RGB guidance, and untrained neural networks. (b) UHRNet: RGB-guided hyperspectral recovery network. (c) USRNet: transformer-based hyperspectral super-resolution network. (d) Head module for feature mapping. (e) Encoder with multi-head attention. (f) SEBlock for channel-wise attention. (g) ConvBlock with convolution and normalization layers.
  • Figure 2: Comparison of hyperspectral reconstruction quality across different methods for Bands 1, 16, and 21. Grayscale heatmaps visualize spatial fidelity. From left to right: Ground Truth (GT), DGI, GISC, TVAL3, GIDC, PYFINETUNE, MST++, and Ours. PSNR and SSIM values are displayed. Our method consistently achieves higher metrics and superior visual quality.
  • Figure 3: Performance under different SNR conditions at 6.25% sampling rate. (a) Average PSNR, (b) average SSIM, and (c) average SAM for each method. Our approach shows significantly better noise resilience.
  • Figure 4: Overall performance comparison in terms of PSNR and SAM. Our method achieves the best trade-off, indicating superior reconstruction fidelity and spectral preservation.
  • Figure 5: Effect of the number of measurement patterns on performance metrics (PSNR, SSIM, SAM) for our method and PYFINETUNE.
  • ...and 8 more figures