Low-Rank Matrix Factorizations with Volume-based Constraints and Regularizations

Olivier Vu Thanh

TL;DR

This work advances identifiable low-rank matrix factorization (LRMF) by leveraging volume-based constraints and bounded or polytopic geometries to improve interpretability and uniqueness. It introduces the bounded simplex-structured matrix factorization (BSSMF) and polytopic matrix factorization (PMF) frameworks, establishes identifiability under sufficiently scattered conditions, and develops efficient algorithms (TITAN, RandSPA, ADMM) for practical use. The thesis demonstrates these methods on hyperspectral unmixing, recommender systems, and missing-data imputation, showing robustness to overfitting and improved recovery. Together, the contributions form a toolkit for interpretable, unique LRMFs applicable to real-world data analysis tasks.

Abstract

Low-rank matrix factorizations (LRMFs) are a class of linear models widely used in various fields such as machine learning, signal processing, and data analysis. These models approximate a matrix as the product of two smaller matrices, where the left matrix captures latent features while the right matrix linearly decomposes the data based on these features. There are many ways to define what makes a component "important." Standard LRMFs, such as the truncated singular value decomposition, focus on minimizing the distance between the original matrix and its low-rank approximation. In this thesis, the notion of "importance" is closely linked to interpretability and uniqueness, which are key to obtaining reliable and meaningful results. This thesis thus focuses on volume-based constraints and regularizations designed to enhance interpretability and uniqueness. We first introduce two new volume-constrained LRMFs designed to enhance these properties. The first assumes that data points are naturally bounded (e.g., movie ratings between 1 and 5 stars) and can be explained by convex combinations of features within the same bounds, allowing them to be interpreted in the same way as the data. The second model is more general, constraining the factors to belong to convex polytopes. Then, two variants of volume-regularized LRMFs are proposed. The first minimizes the volume of the latent features, encouraging them to cluster closely together, while the second maximizes the volume of the decompositions, promoting sparse representations. Across all these models, uniqueness is achieved under the core principle that the factors must be "sufficiently scattered" within their respective feasible sets. Motivated by applications such as blind source separation and missing data imputation, this thesis also proposes efficient algorithms that make these models practical for real-world applications.
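To make the first (bounded) model concrete, here is a minimal sketch of synthetic data that follows it: features bounded in the same range as the data, and decompositions that are convex combinations. The sizes, the bounds 1 and 5 (borrowed from the movie-rating example), and the helper name `make_bounded_simplex_data` are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def make_bounded_simplex_data(m=6, n=8, r=3, lo=1.0, hi=5.0, seed=0):
    """Build X = W @ H with W entries in [lo, hi] and each column of H on the probability simplex."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(lo, hi, size=(m, r))      # latent features, bounded like the data
    H = rng.dirichlet(np.ones(r), size=n).T   # shape (r, n): columns are nonnegative and sum to 1
    return W, H, W @ H

W, H, X = make_bounded_simplex_data()

# Each column of X is a convex combination of the columns of W,
# so its entries stay within the same bounds as the features (up to rounding).
print("columns of H sum to 1:", np.allclose(H.sum(axis=0), 1.0))
print("X within [1, 5]:", X.min() >= 1.0 - 1e-9 and X.max() <= 5.0 + 1e-9)
```

Because both the data and the features live in the same interval, a column of `W` can be read on the same scale as a data point, which is the interpretability argument made in the abstract.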

Paper Structure

This paper contains 92 sections, 17 theorems, 120 equations, 32 figures, 14 tables, and 7 algorithms.

Key Result

Theorem 2.1

[huang2013non] If $W^\top \in \mathbb{R}^{r \times m}$ and $H \in \mathbb{R}^{r \times n}$ are sufficiently scattered, then the Exact NMF $(W,H)$ of $X=WH$ of size $r = \operatorname{rank}(X)$ is essentially unique.
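In the NMF literature, "essentially unique" usually means unique up to a permutation and positive scaling of the rank-one factors; the sketch below assumes the thesis's Definition 2.7 follows that convention. It shows why these two ambiguities can never be removed: permuting and rescaling the columns of $W$, with the inverse operation on the rows of $H$, leaves the product unchanged. The sizes, permutation, and scaling values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 5, 7, 3
W = rng.random((m, r))   # nonnegative left factor
H = rng.random((r, n))   # nonnegative right factor

perm = np.array([2, 0, 1])          # arbitrary reordering of the r components
scale = np.array([0.5, 2.0, 4.0])   # arbitrary positive scalings
P = np.eye(r)[:, perm]              # permutation matrix (so P @ P.T is the identity)
D = np.diag(scale)

W2 = W @ P @ D                      # permuted and rescaled columns of W
H2 = np.linalg.inv(D) @ P.T @ H     # compensating change on the rows of H

# Both pairs are nonnegative and reproduce the same matrix X.
print(np.allclose(W @ H, W2 @ H2))
```

Under the theorem's assumptions, these trivial transformations are the only freedom left in the factorization.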

Figures (32)

  • Figure 1: Geometric interpretation of Exact NMF with $r=3$
  • Figure 2: Geometric interpretation of Exact SSMF with $r=4$ and $n=6$
  • Figure 3: Illustration of the SSC in three dimensions. First panel: the sets $\Delta^3$ and $\mathcal{C}$, which intersect at $(0,0.5,0.5)$, $(0.5,0,0.5)$, and $(0.5,0.5,0)$. The other three panels: examples of a matrix $H \in \mathbb{R}^{3 \times n}$ respectively satisfying the SSC, not satisfying SSC1, and satisfying SSC1 but not SSC2.
  • Figure 4: Influence of centering the data on the cost function topology with respect to $H$, shown on a small example ($m=2,r=2,n=1$). Top: without centering. Bottom: with centering. Five projected gradient steps are shown, each decomposed into a gradient descent step followed by its projection onto the feasible set.
  • Figure 5: Evolution of the training error for ml-1m and MNIST, averaged over 10 runs. For ml-1m, $r=5$ with 1 inner iteration. For MNIST, $r=50$ with 10 inner iterations.
  • ...and 27 more figures
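The two-stage update described in the Figure 4 caption (a gradient step, then a projection onto the feasible set) can be sketched as follows. This is an illustrative sketch only: it assumes the feasible set for a column of $H$ is the probability simplex, uses the standard sort-based Euclidean projection, and picks the step size $1/L$ with $L = \|W\|_2^2$; the matrix `W`, the vector `x`, and the function names are hypothetical, and the centering studied in the figure is not reproduced.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a 1-D vector onto the probability simplex (sort-based method)."""
    u = np.sort(v)[::-1]                       # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index that stays positive after shifting
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def projected_gradient(W, x, h0, n_steps=5):
    """Minimize 0.5 * ||x - W h||^2 over the simplex with a fixed step 1/L."""
    L = np.linalg.norm(W, 2) ** 2              # Lipschitz constant of the gradient
    h = h0.copy()
    for _ in range(n_steps):
        grad = W.T @ (W @ h - x)               # gradient descent direction
        h = project_simplex(h - grad / L)      # projection back onto the feasible set
    return h

# Small example with m=2, r=2, n=1, matching the sizes quoted in the caption.
W = np.array([[1.0, 3.0], [2.0, 1.0]])
x = W @ np.array([0.3, 0.7])
h0 = np.array([0.5, 0.5])
h = projected_gradient(W, x, h0, n_steps=5)
print(h, h.sum())
```

With a feasible starting point and step size $1/L$, each iteration cannot increase the objective, so five steps move `h` from the simplex center toward the generating weights.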

Theorems & Definitions (52)

  • Definition 2.1: Exact NMF
  • Definition 2.2: NMF
  • Definition 2.3: Probability simplex
  • Definition 2.4: Exact SSMF
  • Definition 2.5: SSMF
  • Definition 2.6: Factorization model
  • Definition 2.7: Identifiability / Essential uniqueness
  • Theorem 2.1
  • Definition 2.8: Sufficiently scattered condition
  • Lemma 2.1
  • ...and 42 more
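The intersection points quoted in the Figure 3 caption can be checked numerically. The sketch below assumes the second-order cone commonly used in the NMF identifiability literature, $\mathcal{C} = \{x \in \mathbb{R}^r : e^\top x \ge \sqrt{r-1}\,\|x\|_2\}$, where $e$ is the all-ones vector; the thesis's own Definition 2.8 is not reproduced on this page, so treat this definition and the helper name `cone_margin` as assumptions.

```python
import numpy as np

def cone_margin(x):
    """Return e^T x - sqrt(r-1) * ||x||_2; nonnegative exactly when x lies in the assumed cone C."""
    x = np.asarray(x, dtype=float)
    r = x.size
    return x.sum() - np.sqrt(r - 1) * np.linalg.norm(x)

# Points listed in the Figure 3 caption: they should sit on the boundary of C (margin close to 0).
for p in [(0, 0.5, 0.5), (0.5, 0, 0.5), (0.5, 0.5, 0)]:
    print(p, "on boundary:", np.isclose(cone_margin(p), 0.0, atol=1e-12))

# A vertex of the simplex falls outside C, while the simplex center falls strictly inside.
print("vertex inside C:", cone_margin([1, 0, 0]) >= 0)
print("center inside C:", cone_margin([1 / 3, 1 / 3, 1 / 3]) > 0)
```

Geometrically, the cone touches the simplex only at the midpoints of its edges, which is why a sufficiently scattered $H$ needs columns spread out toward the faces of the simplex.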