Persistent Nonnegative Matrix Factorization via Multi-Scale Graph Regularization

Jichao Zhang, Ran Miao, Limin Li

TL;DR

This work proposes persistent nonnegative matrix factorization (pNMF), a scale-parameterized family of NMF problems that produces a sequence of persistence-aligned embeddings rather than a single one, and develops a sequential alternating optimization algorithm with guaranteed convergence.

Abstract

Matrix factorization techniques, especially Nonnegative Matrix Factorization (NMF), have been widely used for dimensionality reduction and interpretable data representation. However, existing NMF-based methods are inherently single-scale and fail to capture the evolution of connectivity structures across resolutions. In this work, we propose persistent nonnegative matrix factorization (pNMF), a scale-parameterized family of NMF problems that produces a sequence of persistence-aligned embeddings rather than a single one. By leveraging persistent homology, we identify a canonical minimal sufficient scale set at which the underlying connectivity undergoes qualitative changes. These canonical scales induce a sequence of graph Laplacians, leading to a coupled NMF formulation with scale-wise geometric regularization and an explicit cross-scale consistency constraint. We analyze the structural properties of the embeddings along the scale parameter and establish bounds on their increments between consecutive scales. The resulting model defines a nontrivial solution path across scales, rather than a single factorization, which poses new computational challenges. We develop a sequential alternating optimization algorithm with guaranteed convergence. Numerical experiments on synthetic and single-cell RNA sequencing datasets demonstrate the effectiveness of the proposed approach in producing multi-scale low-rank embeddings.
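To make the idea of a coupled, scale-wise regularized factorization concrete, here is a minimal sketch. It is not the paper's algorithm. It uses the standard multiplicative updates for graph-regularized NMF, applied scale by scale with a warm start. The function names `graph_nmf_path` and `objective`, the penalty weights `lam` and `mu`, the squared-Frobenius form of the cross-scale penalty, and the choice of a separate basis `W` per scale are all assumptions made for illustration.

```python
import numpy as np

def objective(X, W, H, L, lam, H_prev=None, mu=0.0):
    """Fit term + graph smoothness term (+ assumed cross-scale proximity term)."""
    val = np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(H @ L @ H.T)
    if H_prev is not None:
        val += mu * np.linalg.norm(H - H_prev) ** 2
    return val

def graph_nmf_path(X, laplacians, rank, lam=0.1, mu=0.1, n_iter=200, seed=0):
    """Return a list of (W_t, H_t), one pair per graph Laplacian.

    X: (m, n) nonnegative data, samples as columns.
    laplacians: list of (n, n) graph Laplacians L_t = D_t - A_t.
    Each scale is warm-started from the previous one, and H_t is pulled
    toward H_{t-1} by an assumed penalty mu * ||H_t - H_{t-1}||_F^2.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + 0.1
    H = rng.random((rank, n)) + 0.1
    eps = 1e-12  # guards against division by zero
    path, H_prev = [], None
    for L in laplacians:
        D = np.diag(np.diag(L))  # degree matrix
        A = D - L                # nonnegative adjacency matrix
        for _ in range(n_iter):
            # Multiplicative updates keep W and H entrywise nonnegative.
            W *= (X @ H.T) / (W @ (H @ H.T) + eps)
            num = W.T @ X + lam * (H @ A)
            den = W.T @ W @ H + lam * (H @ D) + eps
            if H_prev is not None:
                num = num + mu * H_prev
                den = den + mu * H
            H *= num / den
        path.append((W.copy(), H.copy()))
        H_prev = H.copy()
    return path
```

Calling `graph_nmf_path(X, [L_1, ..., L_T], rank=2)` yields one embedding `H_t` per scale, so consecutive embeddings can be compared directly, which is the kind of solution path the abstract describes.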

Paper Structure (45 sections, 16 theorems, 166 equations, 10 figures, 1 table, 1 algorithm)


Key Result

Theorem 3.5

The scale set $\Lambda^\star$ is the canonical minimal sufficient scale set.

Figures (10)

  • Figure 1: Illustration of the construction of multi-scale topological graphs from persistent homology. (a) Vietoris--Rips complex $\mathcal{X}(\varepsilon)$ on a dataset $X \in \mathbb{R}^{2 \times 10}$. (b) The associated $0$-dimensional persistence diagram $\mathcal{D}_0$, encoding the death times of connected components. (c) A sequence of weighted graphs defined at the persistence-based scales $\{\varepsilon_t\}$, which serve as the multi-scale topological graphs.
  • Figure 1: Multi-scale embeddings visualization on simulation data. The first panel shows the intrinsic three-dimensional simulation data before embedding into a 100-dimensional ambient space. The remaining nine panels display the corresponding two-dimensional embeddings $\{H_t\}_{t=1}^{80}$ uniformly sampled across scales.
  • Figure 1: Multi-scale clustering accuracy across different scales $t$. The curves compare the accuracy of pNMF and baseline methods as the scale index varies.
  • Figure 1: Empirical convergence behavior of pNMF on the simulation dataset. (a) Objective function value versus outer iteration. (b) Relative decrease of the objective function between successive outer iterations. (c) Fine-grained evolution of the objective function across all update steps in the outer--scale--inner optimization process.
  • Figure 2: Effect of persistence-based scale selection on multi-scale embeddings visualization. Each column corresponds to the same scale index, while each row represents a different scale selection strategy, ordered as pNMF, pNMF-UDS, pNMF-RDS, and pNMF-MSS.
  • ...and 5 more figures
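The graph construction described in the first caption above (Vietoris--Rips complex, $0$-dimensional persistence diagram, graphs at the persistence-based scales) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code. It relies on the standard fact that the finite $0$-dimensional death times of a Vietoris--Rips filtration are the edge lengths of a minimum spanning tree, under the convention that an edge enters the complex once its length is at most $\varepsilon$. The function names are hypothetical, and binary edge weights stand in for whatever weighting the paper uses.

```python
import numpy as np

def h0_death_scales(points):
    """Return (pairwise distances, sorted distinct H0 death times).

    points: (n, d) array, one sample per row. Death times are found with
    Kruskal's algorithm: each edge that merges two components kills one.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    iu, ju = np.triu_indices(n, k=1)
    order = np.argsort(dist[iu, ju], kind="stable")
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for e in order:
        a, b = find(iu[e]), find(ju[e])
        if a != b:
            parent[a] = b
            deaths.append(dist[iu[e], ju[e]])
    return dist, np.unique(deaths)

def laplacian_at_scale(dist, eps):
    """Unnormalized Laplacian of the graph with an edge wherever dist <= eps.

    Binary weights are an assumption made for this sketch.
    """
    adj = (dist <= eps).astype(float)
    np.fill_diagonal(adj, 0.0)
    return np.diag(adj.sum(axis=1)) - adj

def num_components(L, tol=1e-9):
    """Number of connected components = multiplicity of eigenvalue 0."""
    return int((np.linalg.eigvalsh(L) < tol).sum())
```

For four collinear points at positions 0, 1, 3, and 7, `h0_death_scales` returns the scales 1, 2, and 4, and the graphs at those scales have 3, 2, and 1 connected components respectively.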

Theorems & Definitions (36)

  • Definition 3.1: Sufficient Scale Set
  • Definition 3.2: Distance-Scale Set
  • Definition 3.3: Minimal Sufficient Scale Set
  • Definition 3.4: Canonical Minimal Sufficient Scale Set
  • Theorem 3.5
  • Proof 1
  • Theorem 3.6: Piecewise Lipschitz Continuity of Laplacian
  • Proof 2
  • Theorem 3.7: Lipschitz Bound Between Adjacent Scales
  • Proof 3
  • ...and 26 more