An Accelerated Alternating Partial Bregman Algorithm for ReLU-based Matrix Decomposition

Qingsong Wang, Yunfei Qu, Chunfeng Cui, Deren Han

TL;DR

The paper tackles the challenge of extracting intrinsic low-rank structure from nonnegative sparse matrices by studying a ReLU-based nonlinear matrix decomposition where $M$ is approximated via $\max(0,UV)$ with regularizers. It introduces the accelerated alternating partial Bregman (AAPB) algorithm to solve this multi-block, nonconvex problem, achieving simultaneous updates of $(U,V)$ and a closed-form $W$-subproblem, with convergence guarantees under a KL framework and the L-smad property. The authors provide a detailed convergence analysis and closed-form solutions for various regularization regimes, then validate the approach through numerical experiments on graph-regularized clustering and sparse NMF basis compression, showing superior performance and faster convergence, especially in high-sparsity settings. The work offers a scalable and theoretically grounded framework for ReLU-based factorization that extends to several regularizers and practical applications in clustering and dictionary compression.
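The ReLU model in the summary above can be illustrated with a minimal NumPy sketch. It is not the authors' AAPB implementation: the function names `relu_reconstruct` and `update_W` are hypothetical, the data are synthetic, and the latent update assumes the constraint $\max(0,W)=M$ used in the ReLU-based decomposition literature, which this summary does not spell out. The sketch shows that a sparse nonnegative $M=\max(0,UV)$ can have much higher rank than the latent product $UV$.

```python
import numpy as np

def relu_reconstruct(U, V):
    """Return max(0, U V), the ReLU-based approximation of M."""
    return np.maximum(0.0, U @ V)

def update_W(M, U, V):
    """Closed-form latent update (assumption: constraint max(0, W) = M).

    Where M > 0 the latent entry is pinned to M; where M == 0 it may be
    any nonpositive value, so the closest choice to UV is min(0, UV).
    """
    X = U @ V
    return np.where(M > 0, M, np.minimum(0.0, X))

# Synthetic example: sizes and seed are illustrative only.
rng = np.random.default_rng(0)
m, n, r = 60, 40, 5
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

M = relu_reconstruct(U, V)                   # nonnegative and sparse
rank_latent = np.linalg.matrix_rank(U @ V)   # rank of the latent product
rank_M = np.linalg.matrix_rank(M)            # rank after applying ReLU
sparsity = np.mean(M == 0)                   # fraction of zero entries
W = update_W(M, U, V)                        # latent variable consistent with M

print(f"rank(UV) = {rank_latent}, rank(M) = {rank_M}, sparsity = {sparsity:.2f}")
```

Because $M$ is generated exactly from $UV$ here, the latent update returns $UV$ itself; with noisy or real data, `W` would differ from `U @ V` only on the entries the factors do not yet fit.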

Abstract

Despite the remarkable success of low-rank estimation in data mining, its effectiveness diminishes when applied to data that inherently lacks low-rank structure. To address this limitation, we focus on non-negative sparse matrices and investigate the intrinsic low-rank characteristics of the rectified linear unit (ReLU) activation function. We first propose a novel nonlinear matrix decomposition framework incorporating a comprehensive regularization term designed to simultaneously promote useful structures in clustering and compression tasks, such as low-rankness, sparsity, and non-negativity in the resulting factors. This formulation presents significant computational challenges due to its multi-block structure, non-convexity, non-smoothness, and the absence of global gradient Lipschitz continuity. To address these challenges, we develop an accelerated alternating partial Bregman proximal gradient method (AAPB), whose distinctive feature is that it enables simultaneous updates of multiple variables. Under mild and theoretically justified assumptions, we establish both sublinear and global convergence properties of the proposed algorithm. Through careful selection of kernel generating distances tailored to various regularization terms, we derive corresponding closed-form solutions while ensuring that the $L$-smooth adaptable property holds for any $L\ge 1$. Numerical experiments on graph regularized clustering and sparse NMF basis compression confirm the effectiveness of our model and algorithm.
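For readers unfamiliar with the $L$-smooth adaptable (L-smad) property named in the abstract, the block below gives the definition as it is commonly stated in the Bregman proximal gradient literature. The symbols $f$, $h$, and $D_h$ are notation assumed here for illustration; the paper's own notation may differ.

```latex
% f: differentiable, possibly nonconvex function.
% h: convex, differentiable kernel generating distance.
% The pair (f, h) is L-smad if both L h - f and L h + f are convex,
% which is equivalent to the two-sided descent inequality below.
\[
  \bigl| f(x) - f(y) - \langle \nabla f(y),\, x - y \rangle \bigr|
  \;\le\; L\, D_h(x, y),
  \qquad
  D_h(x, y) = h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle .
\]
```

With $h(x)=\tfrac12\|x\|^2$, $D_h(x,y)=\tfrac12\|x-y\|^2$ and the inequality reduces to the classical descent lemma for functions with $L$-Lipschitz gradient, which is the condition the abstract notes is unavailable globally for this problem.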

Paper Structure

This paper contains 13 sections, 4 theorems, 58 equations, 5 figures, 6 tables, 3 algorithms.

Key Result

Theorem 1

(Subsequence convergence of Algorithm AAPB) Suppose Assumptions 1 and 2 hold, and $0<\lambda\le 1/L$. Let $\{Y^{k}\}_{k\in\mathbb{N}}$ be the sequence generated by the NMD-AAPB algorithm. Then the following statements hold.

Figures (5)

  • Figure 1: Numerical experiments on real-world datasets for solving \ref{NMF_com}.
  • Figure 2: Original factor $\tilde{U}$ of NMF for the YaleB dataset with rank $r=81$, and low-rank reconstructions by TSVD [BoutsidisM14], NMD-APB, and NMD-AAPB with fixed rank $r=55$.
  • Figure 3: The $80$-th basis of the factor $U$ in Figure 2. Left to right: Original, TSVD [BoutsidisM14], NMD-APB, NMD-AAPB.
  • Figure 4: Numerical results for solving \ref{L1-L1-model} on synthetic datasets with $m=1500$, $n=500$, $r^*=10$.
  • Figure 5: Numerical results for solving \ref{L1-L1-model} on two real-world datasets.

Theorems & Definitions (18)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8
  • Definition 9
  • Remark 1
  • ...and 8 more