Applying non-negative matrix factorization with covariates to label matrix for classification

Kenichi Satoh

Applying non-negative matrix factorization with covariates to label matrix for classification

Kenichi Satoh

TL;DR

NMF-LAB recasts classification as the inverse problem of non-negative matrix tri-factorization, yielding a direct, probabilistic mapping from covariates to class labels without an external classifier. By normalizing the factorization coefficients, it outputs class-membership probabilities and supports seamless generalization to unseen covariates, enabling semi-supervised learning. The framework offers two variants: a linear, interpretable direct model and a kernel-based nonlinear model that achieves competitive accuracy on diverse datasets, including MNIST via Nyström approximation. Empirical results demonstrate favorable predictive performance, robustness to label noise, interpretability trade-offs, and scalable applicability to high-dimensional data, establishing a unified, probabilistic approach to regression and classification within the tri-NMF paradigm.

Abstract

Non-negative matrix factorization (NMF) is widely used for dimensionality reduction and interpretable analysis, but standard formulations are unsupervised and cannot directly exploit class labels. Existing supervised or semi-supervised extensions usually incorporate labels only via penalties or graph constraints, still requiring an external classifier. We propose \textit{NMF-LAB} (Non-negative Matrix Factorization for Label Matrix), which redefines classification as the inverse problem of non-negative matrix tri-factorization (tri-NMF). Unlike joint NMF methods, which reconstruct both features and labels, NMF-LAB directly factorizes the label matrix $Y$ as the observation, while covariates $A$ are treated as given explanatory variables. This yields a direct probabilistic mapping from covariates to labels, distinguishing our method from label-matrix factorization approaches that mainly model label correlations or impute missing labels. Our inversion offers two key advantages: (i) class-membership probabilities are obtained directly from the factorization without a separate classifier, and (ii) covariates, including kernel-based similarities, can be seamlessly integrated to generalize predictions to unseen samples. In addition, unlabeled data can be encoded as uniform distributions, supporting semi-supervised learning. Experiments on diverse datasets, from small-scale benchmarks to the large-scale MNIST dataset, demonstrate that NMF-LAB achieves competitive predictive accuracy, robustness to noisy or incomplete labels, and scalability to high-dimensional problems, while preserving interpretability. By unifying regression and classification within the tri-NMF framework, NMF-LAB provides a novel, probabilistic, and scalable approach to modern classification tasks.

Applying non-negative matrix factorization with covariates to label matrix for classification

TL;DR

Abstract

as the observation, while covariates

are treated as given explanatory variables. This yields a direct probabilistic mapping from covariates to labels, distinguishing our method from label-matrix factorization approaches that mainly model label correlations or impute missing labels. Our inversion offers two key advantages: (i) class-membership probabilities are obtained directly from the factorization without a separate classifier, and (ii) covariates, including kernel-based similarities, can be seamlessly integrated to generalize predictions to unseen samples. In addition, unlabeled data can be encoded as uniform distributions, supporting semi-supervised learning. Experiments on diverse datasets, from small-scale benchmarks to the large-scale MNIST dataset, demonstrate that NMF-LAB achieves competitive predictive accuracy, robustness to noisy or incomplete labels, and scalability to high-dimensional problems, while preserving interpretability. By unifying regression and classification within the tri-NMF framework, NMF-LAB provides a novel, probabilistic, and scalable approach to modern classification tasks.

Applying non-negative matrix factorization with covariates to label matrix for classification

TL;DR

Abstract

Applying non-negative matrix factorization with covariates to label matrix for classification

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (2)