Smoothed Separable Nonnegative Matrix Factorization
Nicolas Nadisic, Nicolas Gillis, Christophe Kervazo
TL;DR
This work strengthens separable NMF by replacing the pure-pixel assumption, which requires one data point near each vertex, with the assumption that several data points lie near each vertex, and by introducing smoothed variants of two classic algorithms, VCA and SPA. By aggregating multiple data points near each vertex, SVCA and SSPA are more robust to noise and recover the vertices more accurately, with SVCA supported by theoretical guarantees analogous to those of ALLS. Empirically, the smoothed methods outperform VCA, SPA, and ALLS on synthetic data, hyperspectral unmixing, and facial feature extraction, with median aggregation providing robustness to non-Gaussian noise. Overall, the paper shows that smoothed separable NMF is a practically effective and theoretically grounded approach to convex-hull-based vertex estimation in noisy settings.
Abstract
Given a set of data points belonging to the convex hull of a set of vertices, a key problem in linear algebra, signal processing, data analysis and machine learning is to estimate these vertices in the presence of noise. Many algorithms have been developed under the assumption that there is at least one data point near each vertex; two of the most widely used are vertex component analysis (VCA) and the successive projection algorithm (SPA). This assumption is known as the pure-pixel assumption in blind hyperspectral unmixing, and as the separability assumption in nonnegative matrix factorization. More recently, Bhattacharyya and Kannan (ACM-SIAM Symposium on Discrete Algorithms, 2020) proposed an algorithm for learning a latent simplex (ALLS) that relies on the assumption that there is more than one data point near each vertex. In that scenario, ALLS is probabilistically more robust to noise than algorithms based on the separability assumption. In this paper, inspired by ALLS, we propose smoothed VCA (SVCA) and smoothed SPA (SSPA), which generalize VCA and SPA by assuming the presence of several data points near each vertex. We illustrate the effectiveness of SVCA and SSPA over VCA, SPA and ALLS on synthetic data sets, on the unmixing of hyperspectral images, and on feature extraction on facial image data sets. In addition, our study highlights new theoretical results for VCA.
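
To make the idea of aggregating several data points near each vertex concrete, the following is a minimal NumPy sketch of an SPA-style loop with a simple aggregation step. It is an illustration under stated assumptions, not the SSPA algorithm as defined in the paper: the function name `smoothed_spa_sketch`, the parameter `p` (number of points aggregated per vertex), and the rule used to choose those points (largest inner product with the residual of largest norm) are all hypothetical choices made for this example. With `p=1` the loop behaves like the standard SPA selection rule.

```python
import numpy as np

def smoothed_spa_sketch(X, r, p=1, aggregate=np.mean):
    """Estimate r vertices from the columns of X (illustrative sketch only).

    p and the selection rule are assumptions of this example, not taken
    from the paper. aggregate can be np.mean or np.median.
    """
    X = np.asarray(X, dtype=float)
    R = X.copy()                      # residual, updated by projections
    W_est = np.zeros((X.shape[0], r))
    for k in range(r):
        # SPA step: the residual column with the largest norm
        j = np.argmax(np.sum(R**2, axis=0))
        # Assumed smoothing rule: the p columns best aligned with it
        scores = R[:, j] @ R
        nearest = np.argsort(scores)[-p:]
        # Aggregate the selected data points into one vertex estimate
        W_est[:, k] = aggregate(X[:, nearest], axis=1)
        # Project the residual onto the orthogonal complement of the
        # aggregated residual direction
        u = aggregate(R[:, nearest], axis=1)
        u /= np.linalg.norm(u)
        R -= np.outer(u, u @ R)
    return W_est

# Synthetic data: p noisy copies of each vertex plus interior mixtures
rng = np.random.default_rng(0)
m, r, p = 10, 3, 5
W = rng.uniform(size=(m, r))                     # true vertices
H_near = np.repeat(np.eye(r), p, axis=1)         # points at the vertices
H_mix = 0.5 * rng.dirichlet(np.ones(r), size=100).T + 0.5 / r
H = np.hstack([H_near, H_mix])                   # columns sum to one
X = W @ H + 0.01 * rng.standard_normal((m, H.shape[1]))

W_est = smoothed_spa_sketch(X, r, p=p)
# Distance from each true vertex to each estimated vertex
dists = np.linalg.norm(W[:, :, None] - W_est[:, None, :], axis=0)
print("max vertex error:", dists.min(axis=1).max())
```

Passing `aggregate=np.median` swaps the mean for a median, which mirrors the mean-versus-median choice mentioned in the TL;DR. The vertices are returned in the order they are found, so comparing against ground truth requires matching columns, as done with `dists` above.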
