Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization
Wenlong Lyu, Yuheng Jia
TL;DR
This paper tackles the sensitivity of SymNMF to the choice of $k$ in $k$-NN graph construction. It learns a weighted combination of $k$-NN slices to form a data-driven similarity $S(w)$, and introduces a dual dissimilarity $D(p)$ to enhance discriminability. It also introduces a novel orthogonality regularization $\mathcal{R}(V)$ with column-wise updates under the PHALS framework, which yields a provably convergent alternating optimization. The approach achieves superior clustering performance across eight datasets compared to both fixed and adaptive similarity methods. The learned coefficients $w$ and $p$ are interpretable: they adaptively select reliable nearest-neighbor relations and farthest-neighbor relations, respectively. In practice, the method reduces the search space to $n-1$ dimensions, improves robustness to misleading neighbor relations, and gives an efficient algorithm with convergence guarantees for symmetric NMF-based clustering.
Abstract
Symmetric nonnegative matrix factorization (SymNMF) is a powerful tool for clustering, and it typically uses the $k$-nearest neighbor ($k$-NN) method to construct the similarity matrix. However, $k$-NN may mislead clustering since the neighbors may belong to different clusters, and its reliability generally decreases as $k$ grows. In this paper, we construct the similarity matrix as a weighted $k$-NN graph with learnable weights that reflect the reliability of each $k$-th NN. This approach reduces the search space of similarity matrix learning to $n - 1$ dimensions, as opposed to the $\mathcal{O}(n^2)$ dimensions of existing methods, where $n$ is the number of samples. Moreover, to obtain a discriminative similarity matrix, we introduce a dissimilarity matrix whose structure is dual to that of the similarity matrix, and propose a new form of orthogonality regularization, discussing its geometric interpretation and numerical stability. An efficient alternating optimization algorithm is designed to solve the proposed model, with a theoretical guarantee that the variables converge to a stationary point satisfying the KKT conditions. The advantage of the proposed model is demonstrated by comparison with nine state-of-the-art clustering methods on eight datasets. The code is available at \url{https://github.com/lwl-learning/LSDGSymNMF}.
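To make the weighted $k$-NN construction concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes Euclidean distances, one weight $w_k$ per neighbor rank $k = 1, \dots, n-1$, and a simple averaging step to symmetrize the result; the function names `knn_order` and `weighted_knn_similarity` are hypothetical, and the paper's exact definition of $S(w)$ may differ.

```python
import numpy as np

def knn_order(X):
    """order[i, k-1] is the index of the k-th nearest neighbor of sample i."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared Euclidean distances
    np.fill_diagonal(d2, -np.inf)                   # make each sample rank first for itself
    return np.argsort(d2, axis=1)[:, 1:]            # drop self; shape (n, n-1)

def weighted_knn_similarity(X, w):
    """Assumed form of S(w): the edge from i to its k-th NN gets weight w[k-1]."""
    order = knn_order(X)
    n = order.shape[0]
    w = np.asarray(w, dtype=float)
    assert w.shape == (n - 1,), "one weight per neighbor rank (n - 1 in total)"
    S = np.zeros((n, n))
    S[np.arange(n)[:, None], order] = w             # broadcast w across all rows
    return 0.5 * (S + S.T)                          # symmetrization is an assumption

# Three points on a line; only the 1st nearest neighbor is trusted.
X = np.array([[0.0], [1.0], [3.0]])
S = weighted_knn_similarity(X, np.array([1.0, 0.0]))
```

In this sketch the only free parameters are the $n - 1$ entries of `w`, which is where the reduced search space comes from: a fixed $k$-NN graph corresponds to setting the first $k$ weights to one and the rest to zero, while a learned `w` can down-weight ranks whose neighbors tend to cross cluster boundaries.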
