Fast and Effective Computation of Generalized Symmetric Matrix Factorization

Lei Yang, Han Wan, Min Zhang, Ling Liang

Abstract

In this paper, we study a nonconvex, nonsmooth, and non-Lipschitz generalized symmetric matrix factorization model that unifies a broad class of matrix factorization formulations arising in machine learning, image science, engineering, and related areas. We first establish two exactness properties. On the modeling side, we prove an exact penalty property showing that, under suitable conditions, the symmetry-inducing quadratic penalty enforces symmetry whenever the penalty parameter is sufficiently large but finite, thereby exactly recovering the associated symmetric formulation. On the algorithmic side, we introduce an auxiliary-variable splitting formulation and establish an exact relaxation relationship that rigorously links stationary points of the original objective function to those of a relaxed potential function. Building on these exactness properties, we propose an average-type nonmonotone alternating updating method (A-NAUM) based on the relaxed potential function. At each iteration, A-NAUM alternately updates the two factor blocks by (approximately) minimizing the potential function, while the auxiliary block is updated in closed form. To ensure convergence and enhance practical performance, we further incorporate an average-type nonmonotone line search and show that it is well-defined under mild conditions. Moreover, based on the Kurdyka-Łojasiewicz property and its associated exponent, we establish global convergence of the entire sequence to a stationary point and derive convergence rate results. Finally, numerical experiments on real datasets demonstrate the efficiency of A-NAUM.
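
To make the idea of a symmetry-inducing quadratic penalty concrete, the following is a minimal sketch on a deliberately simplified model, not the paper's generalized formulation and not A-NAUM. It assumes the smooth least-squares objective $\frac{1}{2}\|M - XY^\top\|_F^2 + \frac{\lambda}{2}\|X - Y\|_F^2$ with no constraints or regularizers, and it uses plain exact alternating minimization without the auxiliary variable or the nonmonotone line search. The function names `penalized_objective`, `update_block`, and `alternating_sketch` are hypothetical and chosen only for illustration.

```python
import numpy as np


def penalized_objective(M, X, Y, lam):
    """Illustrative objective: 0.5*||M - X Y^T||_F^2 + 0.5*lam*||X - Y||_F^2."""
    fit = 0.5 * np.linalg.norm(M - X @ Y.T, "fro") ** 2
    penalty = 0.5 * lam * np.linalg.norm(X - Y, "fro") ** 2
    return fit + penalty


def update_block(M, Y, lam):
    """Exact minimizer over X with Y fixed.

    Setting the gradient to zero gives X (Y^T Y + lam*I) = M Y + lam*Y.
    """
    r = Y.shape[1]
    G = Y.T @ Y + lam * np.eye(r)  # symmetric positive definite for lam > 0
    B = M @ Y + lam * Y
    return np.linalg.solve(G, B.T).T  # equals B @ inv(G) because G is symmetric


def alternating_sketch(M, r, lam=1.0, iters=50, seed=0):
    """Alternate exact updates of the two factor blocks (assumed simplified scheme)."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    X = rng.standard_normal((n, r))
    Y = rng.standard_normal((n, r))
    history = [penalized_objective(M, X, Y, lam)]
    for _ in range(iters):
        X = update_block(M, Y, lam)    # minimize over X with Y fixed
        Y = update_block(M.T, X, lam)  # minimize over Y with X fixed
        history.append(penalized_objective(M, X, Y, lam))
    return X, Y, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    B = rng.standard_normal((8, 3))
    M = B @ B.T  # symmetric positive semidefinite test matrix
    X, Y, history = alternating_sketch(M, r=3, lam=1.0, iters=30)
    print("final objective:", history[-1])
    print("asymmetry ||X - Y||_F:", np.linalg.norm(X - Y, "fro"))
```

Because each block update is an exact minimizer of a strongly convex subproblem, the objective values in `history` cannot increase in this simplified setting. The printed asymmetry $\|X - Y\|_F$ can be used to observe empirically how the penalty parameter `lam` influences the gap between the two factors; the sketch makes no claim about when that gap vanishes.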

Paper Structure (32 sections, 16 theorems, 141 equations, 2 figures, 1 table, 1 algorithm)

Key Result

Proposition 2.1

Suppose that $h: \mathbb{R}^{m\times n} \rightarrow \mathbb{R} \cup \{+\infty\}$ is a proper closed function and $\Gamma$ is a compact set. If $h \equiv \zeta$ on $\Gamma$ for some constant $\zeta$ and satisfies the KL property at each point of $\Gamma$, then there exist $\varepsilon>0$, $\nu>0$, an for all $X \in \{X\in\mathbb{R}^{m\times n}: \operatorname{dist}(X,\,\Gamma)<\varepsilon\} \cap \{X

Figures (2)

  • Figure 1: Average relative objective values versus running time (over 5 independent runs) for different factorization ranks $r\in\{5,50,150\}$ and noise levels $t\in\{0.001,0.01,0.1\}$ on ORL.
  • Figure 2: Average relative objective values versus running time (over 5 independent runs) for different penalty parameters $\lambda\in\{0.01,1,100\}$ on ORL and CBCL.

Theorems & Definitions (36)

  • Definition 2.1: KL property and exponent
  • Proposition 2.1: Uniformized KL property
  • Theorem 3.1
  • proof
  • Proposition 3.1
  • proof
  • Theorem 3.2
  • proof
  • Theorem 3.3
  • proof
  • ...and 26 more