MCGA: Mixture of Codebooks Hyperspectral Reconstruction via Grayscale-Aware Attention
Zhanjiang Yang, Lijun Sun, Jiawei Dong, Xiaoxin An, Yang Liu, Meng Li
TL;DR
This work tackles the ill-posed task of reconstructing hyperspectral images from RGB inputs by introducing MCGA, a two-stage framework. The first stage learns a transferable mixture of codebooks (MoC) as spectral priors via a multi-scale VQ-VAE; the second aligns RGB features to these priors with a Grayscale-Aware Network (GANet) that uses top-K quantized self-attention for efficiency. Grayscale-aware operations and test-time adaptation improve photometric consistency and robustness under illumination and distribution shifts, and the framework runs inference 4–5× faster than prior methods. Empirical results on HySpecNet-11k and ARAD-1k show state-of-the-art accuracy and strong cross-dataset generalization, with notable robustness to spatial OOD and illumination perturbations. The work highlights the potential of MoC priors for cross-dataset spectral reconstruction and suggests extensions to synthetic HSI generation and other low-quality image restoration tasks.
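To make the idea of codebook-based spectral priors concrete, the following is a minimal sketch of nearest-codeword lookup, the basic quantization step in a VQ-VAE. It is an illustration only: the function names, the toy 2-D codewords, and the rule of picking the single nearest codeword across all codebooks are assumptions, not the MoC formulation used by MCGA, which is not specified in this summary.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature, codebooks):
    """Return (codebook_index, code_index, codeword) for the nearest codeword.

    Assumption for illustration: the search runs over every codeword of
    every codebook and keeps the single closest one.
    """
    best = None
    for cb_idx, codebook in enumerate(codebooks):
        for code_idx, codeword in enumerate(codebook):
            d = squared_distance(feature, codeword)
            if best is None or d < best[0]:
                best = (d, cb_idx, code_idx, codeword)
    _, cb_idx, code_idx, codeword = best
    return cb_idx, code_idx, codeword


# Two hypothetical codebooks, each holding two 2-D codewords.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [2.0, 2.0]],
]
result = quantize([0.1, 0.9], codebooks)
print(result)  # (1, 0, [0.0, 1.0])
```

In a real model the codewords would be learned and the features would be high-dimensional spectral embeddings rather than toy 2-D points.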
Abstract
Reconstructing hyperspectral images (HSIs) from RGB inputs provides a cost-effective alternative to hyperspectral cameras, but reconstructing high-dimensional spectra from three channels is inherently ill-posed. Existing methods typically regress the RGB-to-HSI mapping directly using large attention networks, which are computationally expensive and handle the ill-posedness only implicitly. We propose MCGA, a Mixture-of-Codebooks with Grayscale-aware Attention framework that explicitly addresses these challenges using spectral priors and photometric consistency. MCGA first learns transferable spectral priors via a mixture-of-codebooks (MoC) from heterogeneous HSI datasets, then aligns RGB features with these priors through grayscale-aware photometric attention (GANet). Efficiency and robustness are further improved via a top-K attention design and test-time adaptation (TTA). Experiments on multiple real-world benchmarks demonstrate state-of-the-art accuracy, strong cross-dataset generalization, and 4–5× faster inference. Code will be made available upon acceptance at https://github.com/Fibonaccirabbit/MCGA.
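As a rough illustration of why a top-K attention design can reduce cost, the sketch below computes scaled dot-product attention for a single query but keeps only the K highest-scoring keys before the softmax. This is a generic top-K attention, not the GANet layer itself; the function name, the single-query setup, and the toy vectors are assumptions made for the example.

```python
import math


def top_k_attention(query, keys, values, k):
    """Attend from one query to only its k highest-scoring keys.

    Returns (output_vector, selected_key_indices).
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]

    # Keep the indices of the k largest scores; all other keys are ignored.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

    # Numerically stable softmax over the selected scores only.
    max_score = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - max_score) for i in top]
    total = sum(weights)

    dim = len(values[0])
    out = [0.0] * dim
    for w, i in zip(weights, top):
        for d in range(dim):
            out[d] += (w / total) * values[i][d]
    return out, top


# Toy inputs (hypothetical): three keys and values, one query.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
out, top = top_k_attention([1.0, 0.0], keys, values, k=2)
print(top)  # [0, 1]
```

With `k=2` the third key, which points away from the query, never enters the softmax, so its value does not contribute to the output. The weighted sum then costs on the order of K rather than the full number of keys per query.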
