Checking the Sufficiently Scattered Condition using a Global Non-Convex Optimization Software

Nicolas Gillis, Robert Luce

TL;DR

The paper addresses identifiability in matrix factorization by focusing on the sufficiently scattered condition (SSC). It reformulates the SSC check as a non-convex quadratic program over a bounded set and solves it with the global solver Gurobi, augmented by a polynomial-time necessary condition (NC-SSC) and McCormick envelope relaxations. The authors prove a tightening result that reduces the test to a bounded problem whose optimal value equals $1$ if and only if the SSC holds, and they demonstrate practical feasibility on synthetic data and on real hyperspectral images factorized with minimum-volume NMF. This provides a practical, exact post hoc certificate of identifiability for a broad class of constrained factorizations, with applications in domains such as hyperspectral imaging.

Abstract

The sufficiently scattered condition (SSC) is a key condition in the study of identifiability of various matrix factorization problems, including nonnegative, minimum-volume, symmetric, simplex-structured, and polytopic matrix factorizations. The SSC allows one to guarantee that the computed matrix factorization is unique/identifiable, up to trivial ambiguities. However, this condition is NP-hard to check in general. In this paper, we show that it can nevertheless be checked in a reasonable amount of time in realistic scenarios, when the factorization rank is not too large. This is achieved by formulating the problem as a non-convex quadratic optimization problem over a bounded set. We use the global non-convex optimization software Gurobi, and showcase the usefulness of this code on synthetic data sets and on real-world hyperspectral images.

Paper Structure

This paper contains 16 sections, 6 theorems, 17 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Let $X = W H$ be an NMF of $X$ with factorization rank $r$, where ${W}^\top$ and $H$ satisfy the SSC. Then this NMF is unique, that is, any other NMF of $X$ with factorization rank $r$, $X = W'H'$ with $W' \geq 0$ and $H' \geq 0$, can be obtained by permutation and scaling of $WH$; see eq:permscal.
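
For reference, the permutation-and-scaling ambiguity mentioned in Theorem 1 is usually written as below. This is the standard form from the NMF literature, not a reproduction of the paper's own equation, and the symbols $\Pi$ and $D$ are introduced here only for illustration.

$$
W' = W \Pi D, \qquad H' = D^{-1} \Pi^\top H,
$$

where $\Pi$ is an $r$-by-$r$ permutation matrix and $D$ is an $r$-by-$r$ diagonal matrix with positive diagonal entries, so that $W'H' = W \Pi D D^{-1} \Pi^\top H = WH$.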

Figures (3)

  • Figure 1: Illustration of the McCormick envelope for the nonlinear constraint $xy = 6$, with $x \in [2,6]$ and $y \in [1,3]$. (Note that, in this example, the last two inequalities, defining the upper bound, coincide.)
  • Figure 2: Number of times, over 20 trials, that $r$-by-$n$ matrices whose columns are $k$-sparse satisfied the SSC.
  • Figure 3: Abundance maps computed via minimum-volume NMF (see eq:minvolNMF). Each image corresponds to the abundance map of a material which is a row of $H$ reshaped as an image.
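
The McCormick envelope described in the caption of Figure 1 can be sketched in a few lines of Python. This is a minimal illustration of the standard four McCormick inequalities for a bilinear term $w = xy$ over a box, not the paper's implementation; the function names `mccormick_bounds` and `in_envelope` and the constant `BOX` are hypothetical and chosen only for this example.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Return (lower, upper) McCormick bounds on w = x*y at the point (x, y).

    The four inequalities follow from products of the bound constraints,
    e.g. (x - x_lo) * (y - y_lo) >= 0 gives the first underestimator.
    """
    under_1 = x_lo * y + x * y_lo - x_lo * y_lo
    under_2 = x_hi * y + x * y_hi - x_hi * y_hi
    over_1 = x_hi * y + x * y_lo - x_hi * y_lo
    over_2 = x_lo * y + x * y_hi - x_lo * y_hi
    return max(under_1, under_2), min(over_1, over_2)


def in_envelope(x, y, w, box, tol=1e-9):
    """Check whether (x, y, w) satisfies the box and McCormick constraints."""
    x_lo, x_hi, y_lo, y_hi = box
    if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi):
        return False
    lower, upper = mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi)
    return lower - tol <= w <= upper + tol


# Setting of Figure 1: x in [2, 6], y in [1, 3], constraint x*y = 6.
BOX = (2.0, 6.0, 1.0, 3.0)

# A point on the curve x*y = 6 lies inside the relaxation.
print(in_envelope(3.0, 2.0, 6.0, BOX))
# The relaxation is looser than the curve: (4, 2) has x*y = 8 but is admitted.
print(in_envelope(4.0, 2.0, 6.0, BOX))
# Points far from the curve are cut off.
print(in_envelope(5.0, 2.5, 6.0, BOX))
```

With $w$ fixed to 6, both underestimators reduce to $x + 2y \leq 8$, which is consistent with the caption's remark that two of the inequalities coincide in this example.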

Theorems & Definitions (13)

  • Definition 1: SSC, huang2013non
  • Theorem 1: huang2013non
  • Definition 2: Necessary Condition for the SSC, NC-SSC
  • Lemma 1
  • proof
  • Corollary 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 3 more