Dual Simplex Volume Maximization for Simplex-Structured Matrix Factorization
Maryam Abdolali, Giovanni Barbarino, Nicolas Gillis
TL;DR
This work reframes simplex-structured matrix factorization (SSMF) through duality: instead of minimizing the volume of the data simplex in the primal space, it maximizes the volume of the polar of that simplex. The method, MV-Dual, preprocesses the data (translation and dimensionality reduction) and uses a polar representation in which feasibility is enforced by the constraint $Y^T \Theta \le 1$. Identifiability is guaranteed under the sufficiently scattered condition (SSC) and extends to the separable case via an $\eta$-expanded framework. The optimization scheme, based on block successive upper-bound minimization (BSUM) with column-wise updates, solves a smoothed, tractable surrogate of the polar-volume objective; the translation vector $v$ is updated in a min-max fashion. Empirically, MV-Dual matches or outperforms state-of-the-art volume- and facet-based methods in endmember recovery and hyperspectral unmixing, particularly in noisy scenarios, while remaining computationally efficient. The dual-volume perspective bridges volume minimization and facet identification, extends identifiability guarantees, and offers practical benefits on real data.
Abstract
Simplex-structured matrix factorization (SSMF) is a generalization of nonnegative matrix factorization, a fundamental interpretable data analysis model, and has applications in hyperspectral unmixing and topic modeling. To obtain identifiable solutions, a standard approach is to find minimum-volume solutions. By taking advantage of the duality/polarity concept for polytopes, we convert minimum-volume SSMF in the primal space to a maximum-volume problem in the dual space. We first prove the identifiability of this maximum-volume dual problem. Then, we use this dual formulation to provide a novel optimization approach which bridges the gap between two existing families of algorithms for SSMF, namely volume minimization and facet identification. Numerical experiments show that the proposed approach performs favorably compared to the state-of-the-art SSMF algorithms.
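The primal-to-dual conversion can be illustrated on a toy example. The sketch below is not the authors' MV-Dual algorithm; it only checks two geometric facts on a hand-picked 2-D triangle, assuming the data have already been translated so that the origin lies in the interior of the simplex. The names `polar_vertices` and `simplex_volume`, and the matrices `W`, `H`, `Y`, and `Theta`, are hypothetical and chosen for illustration. It verifies that points inside the simplex satisfy $Y^T \Theta \le 1$ at the vertices of the polar, and that shrinking the primal simplex enlarges the polar.

```python
import numpy as np
from math import factorial


def polar_vertices(W):
    """Vertices of the polar set {theta : W.T @ theta <= 1} of the simplex conv(W).

    W is d x (d+1); the origin is assumed to lie in the interior of conv(W).
    Vertex j of the polar makes every constraint except the j-th one tight.
    """
    d, r = W.shape
    assert r == d + 1, "a simplex in R^d needs d+1 vertices"
    Theta = np.empty((d, r))
    for j in range(r):
        A = np.delete(W, j, axis=1).T  # d x d system: w_i^T theta = 1 for i != j
        Theta[:, j] = np.linalg.solve(A, np.ones(d))
    return Theta


def simplex_volume(V):
    """Volume of the simplex whose vertices are the columns of V (d x (d+1))."""
    d, r = V.shape
    M = np.vstack([V, np.ones((1, r))])
    return abs(np.linalg.det(M)) / factorial(d)


# Toy primal simplex (a triangle) with the origin as its centroid.
W = np.array([[2.0, -1.0, -1.0],
              [0.0, 1.5, -1.5]])

# Synthetic data: convex combinations of the vertices (columns of H sum to 1).
rng = np.random.default_rng(0)
H = rng.dirichlet(np.ones(3), size=50).T  # 3 x 50
Y = W @ H

Theta = polar_vertices(W)

# Feasibility: every data point satisfies y^T theta <= 1 at each polar vertex.
print("max of Y^T Theta:", (Y.T @ Theta).max())

# Shrinking the primal simplex enlarges its polar.
for s in (1.0, 0.5):
    Ws = s * W
    print(f"scale {s}: primal volume {simplex_volume(Ws):.3f}, "
          f"polar volume {simplex_volume(polar_vertices(Ws)):.3f}")
```

Because the polar is built from the inequalities $w_i^T \theta \le 1$, a smaller enclosing simplex yields looser constraints and hence a larger polar, which is the intuition behind replacing volume minimization with volume maximization in the dual space.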
