
Graph Regularized NMF with L20-norm for Unsupervised Feature Learning

Zhen Wang, Wenwen Min

TL;DR

This paper proposes an unsupervised feature learning framework based on GNMF with an $\ell_{2,0}$-norm constraint, which enhances feature sparsity and reduces the sensitivity of GNMF to noise. The resulting problem is solved with an algorithm based on PALM and an accelerated version of it.

Abstract

Nonnegative Matrix Factorization (NMF) is a widely applied technique in machine learning and data mining. Graph Regularized Nonnegative Matrix Factorization (GNMF) is an extension of NMF that incorporates graph regularization constraints. GNMF has demonstrated exceptional performance in clustering and dimensionality reduction, effectively discovering inherent low-dimensional structures embedded within high-dimensional spaces. However, the sensitivity of GNMF to noise limits its stability and robustness in practical applications. To enhance feature sparsity and mitigate the impact of noise while mining row sparsity patterns in the data for effective feature selection, we introduce an $\ell_{2,0}$-norm constraint as the sparsity constraint for GNMF. We propose an unsupervised feature learning framework based on GNMF\_$\ell_{20}$ and devise an algorithm based on proximal alternating linearized minimization (PALM), together with an accelerated version, to solve this problem. Additionally, we establish the convergence of the proposed algorithms and validate the efficacy and superiority of our approach through experiments conducted on both simulated and real image data.
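To make the row-sparsity idea concrete: the $\ell_{2,0}$-norm of a matrix counts its nonzero rows, so bounding it keeps only a few rows (features) active. The sketch below assumes the constraint is applied to a nonnegative factor matrix in the form "at most `k` nonzero rows"; the function names and the bound `k` are illustrative and not taken from the paper.

```python
import numpy as np

def l20_norm(W):
    """Number of rows of W with nonzero Euclidean norm."""
    return int(np.sum(np.linalg.norm(W, axis=1) > 0))

def project_nonneg_row_sparse(W, k):
    """Closest matrix to W (in Frobenius norm) that is nonnegative
    and has at most k nonzero rows (assumed constraint set)."""
    W_pos = np.maximum(W, 0.0)                 # clip negative entries
    row_norms = np.linalg.norm(W_pos, axis=1)  # energy left in each row
    keep = np.argsort(row_norms)[::-1][:k]     # k rows with the largest norm
    out = np.zeros_like(W_pos)
    out[keep] = W_pos[keep]                    # every other row is zeroed
    return out

W = np.array([[3.0, -1.0],
              [0.1,  0.2],
              [-5.0, -5.0],
              [1.0,  2.0]])
P = project_nonneg_row_sparse(W, k=2)
```

Here rows 0 and 3 survive because they carry the most nonnegative energy, while the strongly negative row 2 is removed entirely. Ties between rows of equal norm are broken arbitrarily by the sort.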

Paper Structure

This paper contains 13 sections, 5 theorems, 84 equations, 4 figures, 3 tables, 3 algorithms.

Key Result

Lemma 1

(Convergence properties) Suppose that Assumption 1 holds. Let $\{(\boldsymbol{W}^k,\boldsymbol{H}^k)\}$ be a sequence generated by Algorithm 3. The following assertions hold. (i) The sequence $\{J(\boldsymbol{W}^k,\boldsymbol{H}^k)\}$ is monotonically nonincreasing and in particular […], where $\rho_{0}=\min\{\frac{1}{2}(c_{k}-L_W),\frac{1}{2}(d_{k}-L_H)\}$. (ii) We have […]

Figures (4)

  • Figure 1: GNMF\_$\ell_{20}$ matrix factorization and clustering process. The input of GNMF\_$\ell_{20}$ is a two-dimensional matrix $\boldsymbol{X}$. The adjacency matrix $\boldsymbol{A}$ is constructed from the data matrix $\boldsymbol{X}$ to represent the correlation between data points. GNMF\_$\ell_{20}$ decomposes $\boldsymbol{X}$ into two low-dimensional matrices, namely the basis matrix $\boldsymbol{W}$ and the coefficient matrix $\boldsymbol{H}$. The matrix $\boldsymbol{H}$ is useful for sample clustering and visualization in low-dimensional space, while the reconstruction $\hat{\boldsymbol{X}}=\boldsymbol{W}\boldsymbol{H}$ can be used for downstream analysis.
  • Figure 2: Results on the synthetic data. (A) Heatmap showing the synthetic data. (B) Convergence performance of PALM and accPALM for GNMF\_$\ell_{20}$; the initial value of the inertial parameter $\beta$ of the acceleration method is 0.5. (C) Comparison of nine unsupervised clustering methods in terms of NMI on the synthetic data.
  • Figure 3: Convergence performance of accPALM with different values of the parameter $\beta$ for GNMF\_$\ell_{20}$. The first four lines show the convergence performance when the extrapolation parameter $\beta$ is fixed, and the last line shows the convergence performance when $\beta$ changes dynamically as the iteration proceeds.
  • Figure 4: 2D visualization results via t-SNE. Comparison of the raw data and the clustering results of GNMF\_$\ell_{20}$ on the LIBRAS, UMIST and JAFFE datasets.
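The pipeline in Figure 1 and the PALM / accPALM iterations compared in Figures 2 and 3 can be sketched roughly as below. This is a minimal illustration under several assumptions that this page does not confirm: the objective is taken to be the standard GNMF form $\frac{1}{2}\|X-WH\|_F^2+\frac{\lambda}{2}\mathrm{Tr}(HLH^\top)$ with $L=D-A$, the adjacency matrix is a symmetric 0/1 k-nearest-neighbour graph over the columns of $X$, the row-sparsity bound `k` is placed on $W$, and acceleration is a plain inertial extrapolation with a fixed `beta`. All function and parameter names (`palm_gnmf_l20`, `gamma`, `lam`, ...) are hypothetical.

```python
import numpy as np

def knn_adjacency(X, n_neighbors=3):
    """Symmetric 0/1 kNN graph over the columns (samples) of X (assumed graph)."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X   # squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-neighbours
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), n_neighbors), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)

def project_rows(W, k):
    """Projection onto {W >= 0, at most k nonzero rows}."""
    W = np.maximum(W, 0.0)
    keep = np.argsort(np.linalg.norm(W, axis=1))[::-1][:k]
    out = np.zeros_like(W)
    out[keep] = W[keep]
    return out

def objective(X, W, H, Lap, lam):
    """Smooth part of the assumed GNMF objective."""
    fit = 0.5 * np.linalg.norm(X - W @ H) ** 2
    return fit + 0.5 * lam * np.trace(H @ Lap @ H.T)

def palm_gnmf_l20(X, r, k, lam=0.1, beta=0.0, gamma=1.1, n_iter=50, seed=0):
    """beta = 0 gives plain PALM; beta > 0 adds inertial extrapolation."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = knn_adjacency(X)
    Lap = np.diag(A.sum(axis=1)) - A                 # graph Laplacian L = D - A
    lap_norm = np.linalg.norm(Lap, 2)
    W = project_rows(rng.random((m, r)), k)          # feasible starting point
    H = rng.random((r, n))
    W_prev, H_prev = W.copy(), H.copy()
    history = [objective(X, W, H, Lap, lam)]
    for _ in range(n_iter):
        # W block: gradient step with c_k > L_W, then projection
        Yw = W + beta * (W - W_prev)
        c = gamma * max(np.linalg.norm(H @ H.T, 2), 1e-12)
        W_prev, W = W, project_rows(Yw - ((Yw @ H - X) @ H.T) / c, k)
        # H block: gradient step with d_k > L_H, then clipping at zero
        Yh = H + beta * (H - H_prev)
        d = gamma * max(np.linalg.norm(W.T @ W, 2) + lam * lap_norm, 1e-12)
        grad_H = W.T @ (W @ Yh - X) + lam * Yh @ Lap
        H_prev, H = H, np.maximum(Yh - grad_H / d, 0.0)
        history.append(objective(X, W, H, Lap, lam))
    return W, H, history

rng = np.random.default_rng(1)
X = rng.random((30, 3)) @ rng.random((3, 20)) + 0.01 * rng.random((30, 20))
W, H, hist = palm_gnmf_l20(X, r=3, k=10, beta=0.0, n_iter=30)
```

With `beta=0.0` the step sizes exceed the block Lipschitz constants, so the recorded objective values do not increase from one iteration to the next. Passing `beta=0.5` matches the initial inertial value quoted in the Figure 2 caption, but this sketch has no safeguard for extrapolated steps and does not reproduce the dynamic update of $\beta$ mentioned in Figure 3, so monotone decrease should not be expected in that mode.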

Theorems & Definitions (10)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof