Hierarchical Matrix Factorization for Interpretable Collaborative Filtering

Kai Sugahara, Kazushi Okamoto

TL;DR

Hierarchical Matrix Factorization (HMF) incorporates clustering concepts into matrix factorization to capture a hierarchy in which leaf nodes correspond to users/items and all other nodes correspond to clusters. The resulting cluster-specific interactions naturally summarize user-item interactions and provide interpretability.

Abstract

Matrix factorization (MF) is a simple collaborative filtering technique that achieves superior recommendation accuracy by decomposing the user-item interaction matrix into user and item latent matrices. Because the model typically learns each interaction independently, it may overlook the underlying shared dependencies between users and items, resulting in less stable and less interpretable recommendations. Based on these insights, we propose "Hierarchical Matrix Factorization" (HMF), which incorporates clustering concepts to capture the hierarchy, where leaf nodes correspond to users/items and all other nodes correspond to clusters. Central to our approach is a technique called hierarchical embeddings: the additional decomposition of the latent matrices (embeddings) into probabilistic connection matrices, which link the levels of the hierarchy, and a root cluster latent matrix. The embeddings are differentiable, allowing simultaneous learning of interactions and clustering using a single gradient descent method. Furthermore, the obtained cluster-specific interactions naturally summarize user-item interactions and provide interpretability. Experimental results on ratings and ranking predictions show that HMF outperforms existing MF methods, in particular achieving a 1.37 point improvement in RMSE for sparse interactions. Additionally, we confirmed that integrating clustering into HMF has the potential to speed up learning convergence and mitigate overfitting compared with MF, and a cluster-centered case study demonstrates its interpretability.
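
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the row-wise softmax used to make the connection matrices probabilistic, the cluster counts, and the names `softmax_rows` and `hierarchical_embeddings` are all assumptions chosen for illustration, and gradient-based training is omitted.

```python
import numpy as np


def softmax_rows(logits):
    """Row-wise softmax: each row becomes a probability distribution."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def hierarchical_embeddings(connection_logits, root_latent):
    """Build leaf embeddings from connection matrices and root cluster vectors.

    connection_logits: list of arrays ordered from the leaf level to the root
        level; softmax_rows turns each into a probabilistic connection matrix
        (the softmax parameterization is an assumption of this sketch).
    root_latent: array of shape (n_root_clusters, n_factors).
    """
    embeddings = root_latent
    # Walk from the root level down to the leaves, mixing cluster vectors.
    for logits in reversed(connection_logits):
        embeddings = softmax_rows(logits) @ embeddings
    return embeddings


rng = np.random.default_rng(0)
n_users, n_items, n_factors = 6, 5, 4  # hypothetical sizes

# Users: 6 leaves -> 3 clusters -> 2 root clusters (hypothetical hierarchy).
user_logits = [rng.normal(size=(n_users, 3)), rng.normal(size=(3, 2))]
user_root = rng.normal(size=(2, n_factors))

# Items: 5 leaves -> 2 root clusters (hypothetical hierarchy).
item_logits = [rng.normal(size=(n_items, 2))]
item_root = rng.normal(size=(2, n_factors))

user_emb = hierarchical_embeddings(user_logits, user_root)
item_emb = hierarchical_embeddings(item_logits, item_root)

predicted = user_emb @ item_emb.T         # user-item scores, as in plain MF
cluster_scores = user_root @ item_root.T  # root-cluster-level interactions
print(predicted.shape, cluster_scores.shape)
```

In this sketch every leaf embedding is a convex combination of root cluster vectors, so each entry of `predicted` is a weighted average of the entries of `cluster_scores`. That is one way to read the claim that cluster-specific interactions summarize user-item interactions.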

Paper Structure

This paper contains 21 sections, 8 equations, 2 figures, and 6 tables.

Figures (2)

  • Figure 1: The differences between HMF and MF model architectures illustrated by an example of rating prediction for User 2 and Item 0. Gray highlights indicate the model parameters trained on a dataset. MF has the latent vectors for each user and item, whereas HMF has latent variables in the root clusters and the connection parameters between levels.
  • Figure 2: MF and HMF losses (shown in RMSE) per epoch on the validation subsets. The best hyperparameter setting was selected for each weight decay, showing the change in its loss. Note that if the validation loss increased for five consecutive epochs, the optimization was terminated.
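
The termination rule mentioned in the Figure 2 caption can be sketched as follows. This assumes that "increased" means the validation loss rose relative to the immediately preceding epoch; the function name `stopping_epoch` and the example loss values are hypothetical and do not come from the paper.

```python
def stopping_epoch(val_losses, patience=5):
    """Return the index of the epoch at which optimization terminates.

    Training stops once the validation loss has increased, relative to the
    previous epoch, for `patience` consecutive epochs. Returns None if that
    never happens within the given loss history.
    """
    consecutive_increases = 0
    for epoch in range(1, len(val_losses)):
        if val_losses[epoch] > val_losses[epoch - 1]:
            consecutive_increases += 1
        else:
            consecutive_increases = 0  # any non-increase resets the counter
        if consecutive_increases >= patience:
            return epoch
    return None


# Hypothetical validation RMSE values, one per epoch (epoch 0 first).
losses = [1.00, 0.90, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.70]
print(stopping_epoch(losses))
```

A common alternative counts epochs without improvement over the best loss seen so far; the caption does not say which variant was used, so the comparison above is only one plausible reading.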