MCGA: Mixture of Codebooks Hyperspectral Reconstruction via Grayscale-Aware Attention

Zhanjiang Yang, Lijun Sun, Jiawei Dong, Xiaoxin An, Yang Liu, Meng Li

TL;DR

This work tackles the ill-posed task of reconstructing hyperspectral images from RGB inputs by introducing MCGA, a two-stage framework. The first stage learns a transferable mixture of codebooks (MoC) as spectral priors via a multi-scale VQ-VAE; the second aligns RGB features to these priors with a Grayscale-Aware Network (GANet) that uses top-K quantized self-attention for efficiency. Grayscale-aware operations and test-time adaptation improve photometric consistency and robustness under illumination and distribution shifts, and the framework runs inference 4–5× faster than prior methods. Empirical results on HySpecNet-11k and ARAD-1k show state-of-the-art accuracy and strong cross-dataset generalization, with notable robustness to spatially out-of-distribution (OOD) inputs and illumination perturbations. The work highlights the potential of MoC priors for cross-dataset spectral reconstruction and suggests extensions to synthetic HSI generation and other low-quality image restoration tasks.
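To make the idea of top-K attention concrete, the following is a minimal NumPy sketch of generic top-K scaled dot-product attention: each query keeps only its `top_k` highest-scoring keys and applies the softmax over those. It is an illustration only, not the GANet design; the function name `topk_attention`, the tensor shapes, and the random inputs are assumptions made here, and the summary does not specify how the quantized variant is built.

```python
import numpy as np


def topk_attention(q, k, v, top_k):
    """Generic top-K attention (illustrative; not the paper's exact layer).

    q: (n_q, d) queries, k: (n_k, d) keys, v: (n_k, d_v) values.
    Each query attends only to its top_k highest-scoring keys.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # (n_q, n_k) scaled dot products

    # Column indices of the top_k largest scores in every row.
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]

    # Copy the selected scores into a matrix that is -inf elsewhere,
    # so the softmax assigns exactly zero weight to discarded keys.
    masked = np.full_like(scores, -np.inf)
    selected = np.take_along_axis(scores, idx, axis=-1)
    np.put_along_axis(masked, idx, selected, axis=-1)

    masked -= masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((16, 8))
v = rng.standard_normal((16, 5))
out, weights = topk_attention(q, k, v, top_k=3)
```

This dense version still computes every score, so it only shows the selection rule. An efficient implementation would gather the selected keys and values and avoid materializing the full weight matrix.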

Abstract

Reconstructing hyperspectral images (HSIs) from RGB inputs provides a cost-effective alternative to hyperspectral cameras, but recovering high-dimensional spectra from three channels is inherently ill-posed. Existing methods typically regress the RGB-to-HSI mapping directly using large attention networks, which are computationally expensive and handle ill-posedness only implicitly. We propose MCGA, a Mixture-of-Codebooks with Grayscale-aware Attention framework that explicitly addresses these challenges using spectral priors and photometric consistency. MCGA first learns transferable spectral priors via a mixture-of-codebooks (MoC) from heterogeneous HSI datasets, then aligns RGB features with these priors through grayscale-aware photometric attention (GANet). Efficiency and robustness are further improved via a top-K attention design and test-time adaptation (TTA). Experiments on multiple real-world benchmarks demonstrate state-of-the-art accuracy, strong cross-dataset generalization, and 4–5× faster inference. Code will be made available upon acceptance at https://github.com/Fibonaccirabbit/MCGA.
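The ill-posedness mentioned above can be seen with a small linear-algebra sketch. If an RGB pixel is modeled as a 3 × C response matrix applied to a C-band spectrum, the matrix has a null space of dimension at least C − 3, so distinct spectra can map to the same RGB value. The linear camera model, the random `response` matrix, and the band count of 31 are assumptions chosen purely for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_bands = 31  # illustrative band count (assumption)
response = rng.random((3, n_bands))  # hypothetical RGB spectral response

# A strictly positive reference spectrum with values in [0.5, 0.75).
spectrum_a = 0.5 + 0.25 * rng.random(n_bands)

# Rows of vt beyond the rank of `response` span its null space.
_, _, vt = np.linalg.svd(response)
null_dir = vt[-1]  # unit vector orthogonal to all three response rows

# Moving along a null-space direction changes the spectrum
# but leaves the projected RGB value unchanged.
spectrum_b = spectrum_a + 0.1 * null_dir

rgb_a = response @ spectrum_a
rgb_b = response @ spectrum_b

print("max spectral difference:", np.abs(spectrum_a - spectrum_b).max())
print("max RGB difference:     ", np.abs(rgb_a - rgb_b).max())
```

Because the RGB observation alone cannot distinguish such spectra, a reconstruction method needs additional information to choose among them, which is the role the abstract assigns to learned spectral priors.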

Paper Structure

This paper contains 16 sections, 1 equation, 2 figures, 6 tables, 2 algorithms.

Figures (2)

  • Figure 1: The proposed MCGA is a Mixture-of-Codebooks framework with Grayscale-aware Attention, leveraging spectral priors and grayscale photometric consistency.
  • Figure 2: A case study on ARAD-1k: the bottom row shows ground truths for each channel, indicated by the numbers.