SpACNN-LDVAE: Spatial Attention Convolutional Latent Dirichlet Variational Autoencoder for Hyperspectral Pixel Unmixing
Soham Chitnis, Kiran Mantripragada, Faisal Z. Qureshi
TL;DR
SpACNN-LDVAE advances hyperspectral pixel unmixing by incorporating local spatial context through a Spatial Attention CNN Encoder that yields a Dirichlet latent for abundances, coupled with a Multivariate Normal spectral decoder. The model extends LDVAE by exploiting spatial coherence, enforcing the abundance sum-to-one constraint (ASC) and the abundance non-negativity constraint (ANC) via a softmax-based Dirichlet parameterization, and optimizing with an ELBO objective that includes a reconstruction term for abundances. Empirical results on Samson, HYDICE Urban, Cuprite, and OnTech-HSI-Syn-21 show improved endmember extraction and abundance estimation over the MLP-LDVAE baseline, with transfer learning from synthetic Cuprite data enabling inference on real-world data. The approach demonstrates the practical value of spatially aware unmixing in hyperspectral imaging and supports generating spectra from abundances, potentially aiding material identification in remote sensing applications.
Abstract
Hyperspectral pixel unmixing aims to find the underlying materials (endmembers) and their proportions (abundances) in the pixels of a hyperspectral image. This work extends the Latent Dirichlet Variational Autoencoder (LDVAE) pixel unmixing scheme by taking local spatial context into account while performing pixel unmixing. The proposed method uses an isotropic convolutional neural network with spatial attention to encode pixels as a Dirichlet distribution over endmembers. We evaluate our model on the Samson, HYDICE Urban, Cuprite, and OnTech-HSI-Syn-21 datasets. Our model also leverages the transfer learning paradigm for the Cuprite dataset, where we train the model on synthetic data and evaluate it on real-world data. The results suggest that incorporating spatial context improves both endmember extraction and abundance estimation.
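
To make the Dirichlet encoding concrete, the following minimal sketch shows why a sample from a Dirichlet distribution can be read directly as an abundance vector: every component is non-negative and the components sum to one. It is an illustration only, not the model described above. The concentration parameters, the toy endmember spectra, the function names, and the linear mixing step are all assumptions made for this example; the paper's decoder is a Multivariate Normal spectral decoder, which this sketch does not implement.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mix_spectrum(abundances, endmembers):
    """Linear mixture (an assumption for illustration): weighted sum of spectra."""
    n_bands = len(endmembers[0])
    return [
        sum(a * spectrum[band] for a, spectrum in zip(abundances, endmembers))
        for band in range(n_bands)
    ]


rng = random.Random(0)

# Hypothetical concentration parameters for three endmembers.
alpha = [2.0, 5.0, 1.0]

# Toy endmember spectra with four bands each (made-up values).
endmembers = [
    [0.1, 0.2, 0.3, 0.4],
    [0.8, 0.7, 0.6, 0.5],
    [0.3, 0.3, 0.3, 0.3],
]

abundances = sample_dirichlet(alpha, rng)
spectrum = mix_spectrum(abundances, endmembers)

print("abundances:", abundances)
print("sum of abundances:", sum(abundances))
print("mixed spectrum:", spectrum)
```

Because the abundances lie on the probability simplex, the mixed value in each band is a convex combination of the endmember values in that band, so it always falls between the smallest and largest endmember reflectance for that band.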
