On the Connection Between Non-negative Matrix Factorization and Latent Dirichlet Allocation

Benedikt Geiger; Peter J. Park

TL;DR

This work investigates how non-negative matrix factorization (NMF) with the generalized KL divergence relates to topic models such as PLSA and LDA. It shows that adding $\ell_1$ normalization constraints on the columns of both factor matrices makes the multiplicative updates algorithm equivalent to PLSA, and that additionally placing a Dirichlet prior on the columns of one matrix yields LDA. The authors derive joint multiplicative updates that update both factor matrices simultaneously, reducing computational cost, and their analysis of sparse variants shows that a Lasso ($\ell_1$) penalty on one matrix combined with an $\ell_1$ normalization constraint on the other is insufficient to induce sparsity. By unifying the optimization and probabilistic perspectives, the paper clarifies how NMF and topic models relate.

Abstract

Non-negative matrix factorization with the generalized Kullback-Leibler divergence (NMF) and latent Dirichlet allocation (LDA) are two popular approaches for dimensionality reduction of non-negative data. Here, we show that NMF with $\ell_1$ normalization constraints on the columns of both matrices of the decomposition and a Dirichlet prior on the columns of one matrix is equivalent to LDA. To show this, we demonstrate that explicitly accounting for the scaling ambiguity of NMF by adding $\ell_1$ normalization constraints to the optimization problem allows a joint update of both matrices in the widely used multiplicative updates (MU) algorithm. When both of the matrices are normalized, the joint MU algorithm leads to probabilistic latent semantic analysis (PLSA), which is LDA without a Dirichlet prior. Our approach of deriving joint updates for NMF also reveals that a Lasso penalty on one matrix together with an $\ell_1$ normalization constraint on the other matrix is insufficient to induce any sparsity.
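The scaling ambiguity mentioned in the abstract can be illustrated with a minimal sketch. The code below uses the classical alternating multiplicative updates for the generalized KL divergence, not the joint update derived in the paper, and only removes the ambiguity after the fact by rescaling so that the columns of $W$ sum to one. The function names `kl_divergence` and `nmf_kl_mu`, the random initialization, and the small constant `eps` are assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl_mu(V, K, n_iter=200, seed=0, eps=1e-12):
    """Alternating multiplicative updates for KL-NMF (illustrative sketch).

    Returns W with columns summing to one, the correspondingly rescaled H,
    and the loss after every iteration.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, K)) + 0.1
    H = rng.random((K, n_cols)) + 0.1
    losses = []
    for _ in range(n_iter):
        # Update H with W fixed.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
        # Update W with the new H fixed.
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)
        losses.append(kl_divergence(V, W @ H))
    # Remove the scaling ambiguity: W @ H is unchanged by this rescaling.
    lam = W.sum(axis=0)
    W = W / lam
    H = H * lam[:, None]
    return W, H, losses
```

Calling `nmf_kl_mu(V, K=4)` on a non-negative count matrix `V` returns factors whose product approximates `V`. The final rescaling step is what an explicit normalization constraint would enforce throughout the optimization.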

Paper Structure (17 sections, 18 theorems, 88 equations, 7 algorithms)


Introduction
Related work
Bayesian NMF
Preliminaries
Non-negative matrix factorization (NMF)
Probabilistic latent semantic analysis (PLSA)
Latent Dirichlet allocation (LDA)
Algorithms for NMF with normalization constraints
Connection between NMF and LDA
Sparse NMF
Conclusion
Auxiliary functions for NMF
NMF with normalization constraints
Dirichlet--Poisson model
Equivalent generative models
...and 2 more sections

Key Result

Lemma 4.1

Let $(W, H)$ be a solution of the standard NMF optimization problem and let $\lambda_k = \sum_v w_{vk}$. Then $(\widetilde{W}, \widetilde{H})$ with $\widetilde{w}_{vk} = w_{vk} / \lambda_k$ and $\widetilde{h}_{kd} = \lambda_k h_{kd}$ is a solution of NMF with a normalization
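The rescaling in Lemma 4.1 can be checked numerically. The sketch below applies the definitions of $\lambda_k$, $\widetilde{w}_{vk}$ and $\widetilde{h}_{kd}$ to arbitrary positive matrices and confirms two facts that follow directly from those definitions: the product $\widetilde{W}\widetilde{H}$ equals $WH$, so any loss that depends only on the product is unchanged, and each column of $\widetilde{W}$ sums to one. The helper name `normalize_columns` and the matrix sizes are assumptions for illustration. The check does not solve the NMF problem itself.

```python
import numpy as np


def normalize_columns(W, H):
    """Rescale (W, H) as in Lemma 4.1 without changing the product W @ H."""
    lam = W.sum(axis=0)          # lambda_k = sum_v w_vk
    W_tilde = W / lam            # w~_vk = w_vk / lambda_k
    H_tilde = lam[:, None] * H   # h~_kd = lambda_k * h_kd
    return W_tilde, H_tilde


rng = np.random.default_rng(0)
W = rng.random((6, 3)) + 0.1     # strictly positive, so every lambda_k > 0
H = rng.random((3, 5)) + 0.1
W_tilde, H_tilde = normalize_columns(W, H)

print(np.allclose(W_tilde @ H_tilde, W @ H))   # product is preserved
print(np.allclose(W_tilde.sum(axis=0), 1.0))   # columns of W~ sum to one
```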

Theorems & Definitions (35)

Lemma 4.1
Lemma 4.2
Lemma 6.1
Lemma A.1: lee2000algorithms
proof
Lemma A.2
proof
Corollary A.3: lee2000algorithms
proof
Lemma 4.2
...and 25 more
