Theoretical and Practical Progress in Hyperspectral Pixel Unmixing with Large Spectral Libraries from a Sparse Perspective
Jade Preston, William Basener
TL;DR
The paper tackles hyperspectral unmixing when large spectral libraries are available, a setting in which ordinary least squares (OLS) becomes unstable because the library matrix it must invert is non-invertible. It systematically evaluates several regression-based unmixing approaches: OLS, non-negative least squares (NNLS), ridge, lasso, step-wise regression, and Bayesian model averaging (BMA). The evaluation uses a Cuprite AVIRIS dataset with a USGS spectral library, with RMSE, model size, non-negativity, and mineral detection as metrics. A central finding is that priors aligned with the phenomenology of hyperspectral imagery outperform priors optimized for standard OLS prediction. Lasso regression emerges as the best-performing method for achieving sparse, non-negative abundances and fast detection, NNLS offers the fastest runtimes, and BMA variants provide competitive accuracy at higher compute cost. The findings guide practical choices for large-library unmixing and highlight the value of sparsity-inducing regularization, suggesting future work that combines sparse methods with Bayesian ensembles for enhanced performance.
Abstract
Hyperspectral unmixing is the process of determining the presence of individual materials and their respective abundances from an observed pixel spectrum. Unmixing is a fundamental process in hyperspectral image analysis, and is growing in importance as increasingly large spectral libraries are created and used. Unmixing is typically done with ordinary least squares (OLS) regression. However, when unmixing with large spectral libraries where the materials present in a pixel are not known a priori, solving for the OLS coefficients requires inverting a matrix, built from the large spectral library, that is not invertible. A number of regression methods can produce a numerical solution using regularization, but their effectiveness varies considerably. In addition, simple methods that are unpopular in the statistics literature, notably step-wise regression, are used with some level of effectiveness in hyperspectral analysis. In this paper, we provide a thorough performance evaluation of these methods, based on how often they select the correct materials in their models. The methods investigated are ordinary least squares regression, non-negative least squares regression, ridge regression, lasso regression, step-wise regression, and Bayesian model averaging. We evaluate these unmixing approaches using multiple criteria: incorporation of non-negative abundances, model size, accurate mineral detection, and root mean squared error (RMSE). We provide a taxonomy of the regression methods, showing that most of them can be understood as Bayesian methods with specific priors. We conclude that methods derivable from priors that correspond to the phenomenology of hyperspectral imagery outperform those whose priors are optimal for prediction performance under the assumptions of ordinary least squares linear regression.
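To make the non-invertibility problem concrete, the following is a minimal sketch on synthetic data, not the paper's code or dataset. It assumes a hypothetical random library `E` with more spectra than bands, so the Gram matrix `E.T @ E` needed by OLS is rank deficient. It then contrasts a ridge solution (equivalent to a Gaussian prior on the abundances) with a sparse, non-negative lasso-style solution. The lasso solver here is a simple projected proximal-gradient loop chosen for illustration; the names `nonneg_lasso`, `lam_ridge`, and `lam_lasso`, and all numeric values, are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: more library spectra than spectral bands.
n_bands, n_library = 50, 200
E = rng.uniform(0.0, 1.0, size=(n_bands, n_library))  # columns = library spectra

# Synthetic pixel: a mixture of three library materials plus small noise.
true_idx = np.array([5, 60, 150])
a_true = np.zeros(n_library)
a_true[true_idx] = [0.5, 0.3, 0.2]
x = E @ a_true + rng.normal(0.0, 1e-3, size=n_bands)

# OLS needs (E^T E)^{-1}, but E^T E is n_library x n_library with rank
# at most n_bands, so it is singular whenever n_library > n_bands.
gram = E.T @ E
rank = np.linalg.matrix_rank(gram)

# Ridge adds lam * I, which makes the system solvable, but the solution
# is dense and may contain negative abundances.
lam_ridge = 1e-2  # assumed value
a_ridge = np.linalg.solve(gram + lam_ridge * np.eye(n_library), E.T @ x)


def nonneg_lasso(E, x, lam, n_iter=5000):
    """Minimize 0.5*||E a - x||^2 + lam*sum(a) subject to a >= 0.

    Illustrative projected proximal-gradient (ISTA-style) loop.
    """
    step = 1.0 / np.linalg.norm(E, 2) ** 2  # 1 / Lipschitz constant
    a = np.zeros(E.shape[1])
    for _ in range(n_iter):
        grad = E.T @ (E @ a - x)
        # Gradient step, L1 shrinkage, and projection onto a >= 0 in one move.
        a = np.maximum(a - step * (grad + lam), 0.0)
    return a


lam_lasso = 1e-2  # assumed value
a_lasso = nonneg_lasso(E, x, lam_lasso)

print("rank of E^T E:", rank, "of", n_library)
print("nonzero ridge coefficients:", np.count_nonzero(a_ridge))
print("nonzero lasso coefficients:", np.count_nonzero(a_lasso))
print("largest lasso abundances at indices:", np.argsort(a_lasso)[-3:])
```

The sketch is only meant to show the qualitative contrast the abstract describes: ridge makes the system solvable but spreads weight across the library, while an L1 penalty with a non-negativity constraint yields models with few materials and physically meaningful abundances. Whether the largest coefficients match the true materials depends on the library and the penalty strength.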
