Dual Simplex Volume Maximization for Simplex-Structured Matrix Factorization

Maryam Abdolali, Giovanni Barbarino, Nicolas Gillis

TL;DR

This work reframes simplex-structured matrix factorization (SSMF) through a duality lens by maximizing the volume of the polar of the data simplex, rather than minimizing the primal volume. The method, MV-Dual, uses a preprocessing pipeline (translation and dimensionality reduction) and a polar representation to enforce feasibility constraints via $Y^T \Theta \le 1$, with identifiability guaranteed under the sufficiently scattered condition (SSC) and extendable to separability via an $\eta$-expanded framework. An optimization scheme based on block successive upper-bound minimization (BSUM) and column-wise updates solves a smoothed, tractable surrogate of the polar-volume objective, with a min-max flavor for updating the translation vector $v$. Empirically, MV-Dual achieves competitive or superior endmember recovery and hyperspectral unmixing performance while offering favorable computational efficiency compared to state-of-the-art volume- and facet-based methods, particularly in noisy scenarios. This dual-volume perspective bridges volume minimization and facet identification, extending identifiability guarantees and providing practical benefits for real-data applications.
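The preprocessing step mentioned above (translation followed by dimensionality reduction) can be illustrated with a minimal sketch. It assumes the translation vector $v$ is initialized as the sample mean of the data columns and that the reduction is a truncated SVD onto $r-1$ dimensions; the function name `preprocess` and the synthetic data are hypothetical, and the paper's own pipeline (in particular its min-max update of $v$) may differ.

```python
import numpy as np

def preprocess(X, r):
    """Center the columns of X (m x n) and project them onto r-1 dimensions.

    Assumption: v is the column mean; this is only one possible initial choice.
    """
    v = X.mean(axis=1, keepdims=True)            # translation vector (m x 1)
    Xc = X - v                                   # translated data
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    U = U[:, : r - 1]                            # basis of the (r-1)-dim subspace
    Y = U.T @ Xc                                 # reduced data, (r-1) x n
    return Y, U, v

# Synthetic SSMF data: X = W H with column-stochastic H (hypothetical sizes).
rng = np.random.default_rng(0)
m, r, n = 10, 3, 200
W = rng.random((m, r))
H = rng.dirichlet(np.ones(r), size=n).T          # r x n, each column sums to 1
X = W @ H

Y, U, v = preprocess(X, r)
print(Y.shape)
```

Because the columns of $H$ sum to one, the centered data lie in an $(r-1)$-dimensional subspace, so in the noiseless case the reduced matrix $Y$ loses no information: $UY + v$ reproduces $X$.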

Abstract

Simplex-structured matrix factorization (SSMF) is a generalization of nonnegative matrix factorization, a fundamental interpretable data analysis model, and has applications in hyperspectral unmixing and topic modeling. To obtain identifiable solutions, a standard approach is to find minimum-volume solutions. By taking advantage of the duality/polarity concept for polytopes, we convert minimum-volume SSMF in the primal space to a maximum-volume problem in the dual space. We first prove the identifiability of this maximum-volume dual problem. Then, we use this dual formulation to provide a novel optimization approach which bridges the gap between two existing families of algorithms for SSMF, namely volume minimization and facet identification. Numerical experiments show that the proposed approach performs favorably compared to the state-of-the-art SSMF algorithms.

Paper Structure (40 sections, 8 theorems, 54 equations, 10 figures, 4 tables, 1 algorithm)


Key Result

Theorem 1

Let $Y$ be an $(r-1)\times n$ real matrix with $n\ge r$ such that $Y = PH$, where $P$ is an $(r-1)\times r$ full-rank real matrix with $Pe = 0$, and $H$ is an $r\times n$ column-stochastic matrix satisfying the SSC. Then the maximum-volume dual problem, with feasibility constraint $Y^T \Theta \le 1$, is uniquely solved by the polar matrix of $P$.
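To make the objects in this statement concrete, here is a small numerical sketch. It assumes the standard polar of a simplex containing the origin: column $\theta_j$ of $\Theta$ satisfies $\theta_j^T p_i = 1$ for every vertex $p_i$ with $i \ne j$, which together with $Pe = 0$ gives $P^T \Theta = ee^T - rI$. That characterization, the helper `simplex_volume`, and the random data are illustrative assumptions, not the paper's definition or solver.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
r, n = 4, 50

# P: (r-1) x r with columns as simplex vertices, centered so that P e = 0.
P = rng.standard_normal((r - 1, r))
P -= P.mean(axis=1, keepdims=True)

# Assumed polar relation: P^T Theta = e e^T - r I (consistent since P e = 0).
target = np.ones((r, r)) - r * np.eye(r)
Theta, *_ = np.linalg.lstsq(P.T, target, rcond=None)   # (r-1) x r

# Data generated as Y = P H with column-stochastic H.
H = rng.dirichlet(np.ones(r), size=n).T
Y = P @ H

# Feasibility of the polar matrix: every entry of Y^T Theta is at most 1.
feasible = bool(np.all(Y.T @ Theta <= 1 + 1e-9))

def simplex_volume(V):
    """Volume of the simplex whose vertices are the d+1 columns of V (d x (d+1))."""
    d = V.shape[0]
    M = np.vstack([V, np.ones((1, V.shape[1]))])
    return abs(np.linalg.det(M)) / math.factorial(d)

# A shrunken copy of Theta is still feasible but has strictly smaller volume.
shrunk = 0.5 * Theta
print(feasible, simplex_volume(Theta) > simplex_volume(shrunk))
```

Under these assumptions $Y^T \Theta = H^T P^T \Theta = ee^T - rH^T$, so feasibility follows directly from $H \ge 0$. The theorem is the stronger claim that, when $H$ satisfies the SSC, no other feasible $\Theta$ attains a volume as large as the polar matrix's.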

Figures (10)

  • Figure 1: Comparison of separability (left), SSC (middle), and the facet-based condition (right) for the matrix $H$ whose columns lie on $\Delta^r$ in the case $r = 3$. On the left, separability requires the columns of $H$ to contain the unit vectors, that is, $H(:,\mathcal{K}) = I_r$ for some $\mathcal{K}$. In the middle, the SSC requires $\mathcal{C} \subset \operatorname{cone}(H)$. On the right, the facet-based condition requires $r=3$ columns of $H$ on each facet of the unit simplex. Figure from abdolali2021simplex.
  • Figure 2: Visual representation in 3 and 2 dimensions of the unit simplex $\Delta^3$, the cone $\mathcal{C}$ intersected with $\Delta^3$, the symmetrized and expanded $\Delta_\mu^3$, the associated $\mathcal{H}_\eta = \Delta^3\cap\Delta_\mu^3$, and an $\eta$-expanded $H$.
  • Figure 3: Average ERR metric and running time (in seconds) vs. purity over 10 trials for noiseless data and different values of $r$ and $m$.
  • Figure 4: MV-Dual vs. GFPI in the low-purity case. GT stands for ground truth.
  • Figure 5: Average ERR metric vs purity over 10 trials for noisy data and different values of $r$, $m$ and SNR levels.
  • ...and 5 more figures

Theorems & Definitions (19)

  • Definition 1: SSC
  • Definition 2: Polar
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Definition 3
  • Lemma 1
  • Lemma 2
  • Theorem 3
  • ...and 9 more