From Image- to Pixel-level: Label-efficient Hyperspectral Image Reconstruction

Yihong Leng, Jiaojiao Li, Haitao Xu, Rui Song

TL;DR

This work tackles label-efficient hyperspectral image (HSI) reconstruction by introducing Pixel-SSR, which reconstructs HSIs from RGB images and sparse point spectra. It uses a Gamma-distribution-based strategy to synthesize point spectra for scenes lacking measurements, and a Dynamic Prompt Mamba (DyPro-Mamba) that leverages a three-branch Dynamic Receptive Prompt Neck (DRPN) and PromptSSM modules to fuse three prompts: Spa-FRFT (spatial sequences in the Fractional Fourier Transform domain), Spa-HF (high-frequency edge details), and spectral dependency. The method achieves competitive reconstruction accuracy with labels as sparse as 0.01% (Figure 1) and demonstrates robustness across unsupervised and image-level supervised settings, supporting its potential as a universal paradigm for HSI reconstruction. Combining Gamma-modeled spectra with dynamic prompting lets the model exploit both spatial and spectral priors, with implications for RGB-based downstream tasks beyond HSI reconstruction.

Abstract

Current hyperspectral image (HSI) reconstruction methods primarily rely on image-level supervision, which requires abundant high-quality HSIs that are time-consuming to acquire with hyperspectral imagers. In contrast, spectrometers offer a more efficient alternative by capturing high-fidelity point spectra, enabling pixel-level HSI reconstruction that balances accuracy and label efficiency. To this end, we introduce a pixel-level spectral super-resolution (Pixel-SSR) paradigm that reconstructs HSIs from RGB images and point spectra. Despite its advantages, Pixel-SSR presents two key challenges: 1) generalizing to novel scenes that lack point spectra, and 2) extracting information effectively enough to improve reconstruction accuracy. To address the first challenge, we investigate a Gamma-modeled strategy that synthesizes point spectra based on their intrinsic properties: nonnegativity, a skewed distribution, and positive correlation. To address the second, complementary three-branch prompts are extracted from RGB images and point spectra and fed to a Dynamic Prompt Mamba (DyPro-Mamba), which progressively guides the reconstruction with global spatial distributions, edge details, and spectral dependency. Comprehensive evaluations, including horizontal comparisons with leading methods and vertical assessments across unsupervised and image-level supervised paradigms, demonstrate that our method achieves competitive reconstruction accuracy with efficient label consumption.
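
The three properties named in the abstract (nonnegativity, skewness, positive correlation) can be illustrated with a small NumPy sketch. The abstract does not spell out the synthesis procedure, so this is not the authors' algorithm: it swaps in a textbook construction in which every band shares one Gamma-distributed component and adds its own independent Gamma component. The function name `synth_point_spectra`, the shape and scale values, and the choice of 31 bands are all assumptions made for illustration.

```python
import numpy as np

def synth_point_spectra(n_points, n_bands, shared_shape=2.0,
                        band_shape=1.0, scale=0.1, seed=0):
    """Draw synthetic point spectra with Gamma-distributed band values.

    Hypothetical helper, not the paper's method. Each band is the sum of
    one component shared by all bands of a point and one independent
    per-band component, so band values are nonnegative, right-skewed,
    and positively correlated with each other.
    """
    rng = np.random.default_rng(seed)
    # Shared component: one draw per point, broadcast over all bands.
    shared = rng.gamma(shared_shape, scale, size=(n_points, 1))
    # Independent component: one draw per point and band.
    per_band = rng.gamma(band_shape, scale, size=(n_points, n_bands))
    return shared + per_band

# 31 bands is an assumed value, chosen only for the example.
spectra = synth_point_spectra(n_points=5000, n_bands=31)
```

Because both components use the same scale, each band is itself Gamma-distributed with shape `shared_shape + band_shape`, and the correlation between any two bands is `shared_shape / (shared_shape + band_shape)`. A Gaussian model, by contrast, is symmetric and can produce negative values.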

Paper Structure

This paper contains 23 sections, 12 equations, 8 figures, 13 tables.

Figures (8)

  • Figure 1: Reconstruction performance comparisons. Blue, orange, and green denote SOTA methods in the unsupervised mode, our proposed Pixel-SSR mode, and the image-level supervised mode, respectively. The improvement from blue to orange shows that our universal Pixel-SSR paradigm significantly improves accuracy while using only 0.01% labels. Our method, marked with a red star, achieves the best results among all methods in Pixel-SSR mode (orange) and results comparable to those in image-level supervised mode (green). Marker sizes reflect the methods' parameter counts.
  • Figure 2: Visualizations. The spectral curves (a) and the histogram of the ground-truth HSI (b) both exhibit skewness. The HSI reconstructed with the Gamma-modeled strategy (c) matches the distribution more precisely than the one reconstructed with the Gaussian-modeled strategy (d).
  • Figure 3: Overview of our Pixel-SSR paradigm. At its core, the Dynamic Receptive Prompt Neck (DRPN) is designed to represent Spa-FRFT Prompt, Spa-HF Prompt, and Spectral Prompt, respectively denoted as spatial-wise sequences within the Fractional Fourier Transform (FRFT) domain, high-frequency representation, and spectral-wise dependency. These multi-type features undergo our Dynamic Prompt Mamba (DyPro-Mamba) with several PromptSSMs to delineate high-quality HSIs. When employed as a universal Pixel-SSR paradigm, the dotted box in (a) can be replaced with various leading methods, sharing the same DRPN and loss constraints with Ours.
  • Figure 4: Visualization of multi-modality features from DRPN. Specifically, we visualize the frequency components of the Spa-FRFT Prompt $\mathcal{P}_{spa}$ and the Spa-HF Prompt $\mathcal{P}_{hf}$. The visualization indicates that $\mathcal{P}_{spa}$ focuses on the center, representing the low-frequency global spatial-wise content, while $\mathcal{P}_{hf}$ focuses on the periphery, denoting the high-frequency edge details. In addition, the Spectral Prompt $\mathcal{P}_{spe} \in \mathbb{R}^{C \times C}$ denotes intrinsic spectral-wise dependency.
  • Figure 5: Comparisons of reconstructed spectral curves between the Gamma- and Gaussian-modeled strategies. Three representative points are selected to analyze the accuracy in the background, fake lemons, and real lemons, indicating that the Gamma-modeled strategy achieves more precise and high-fidelity results, especially when distinguishing slight gaps between real and fake objects.
  • ...and 3 more figures
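
The Figure 4 caption describes a centered frequency spectrum in which low frequencies sit at the center and high frequencies at the periphery, plus a $C \times C$ spectral dependency. The sketch below illustrates only those two ideas. It uses the ordinary 2-D FFT rather than the Fractional Fourier Transform, and a band-wise cosine similarity as a stand-in for spectral dependency; it does not reproduce DRPN. The names `split_frequencies` and `spectral_dependency` and the `radius` parameter are hypothetical.

```python
import numpy as np

def split_frequencies(img, radius):
    """Split a 2-D image into low- and high-frequency parts.

    Uses the ordinary FFT (an assumption; the paper works in the FRFT
    domain). After fftshift, the zero frequency sits at the center, so a
    centered disk keeps global content and its complement keeps edges.
    """
    h, w = img.shape
    spec = np.fft.fftshift(np.fft.fft2(img))
    yy, xx = np.ogrid[:h, :w]
    center_mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(spec * center_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spec * ~center_mask)).real
    return low, high

def spectral_dependency(cube):
    """Return a C x C cosine-similarity matrix between the bands of a
    (C, H, W) cube, as a simple proxy for spectral-wise dependency."""
    c = cube.shape[0]
    flat = cube.reshape(c, -1)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    return flat @ flat.T

rng = np.random.default_rng(0)
image = rng.random((32, 32))
low, high = split_frequencies(image, radius=4)
dependency = spectral_dependency(rng.random((31, 8, 8)))  # 31 bands assumed
```

Since the two masks are complementary, `low + high` recovers the input image, which makes the split easy to sanity-check.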