Revealing and Utilizing In-group Favoritism for Graph-based Collaborative Filtering

Hoin Jung, Hyunsoo Cho, Myungje Choi, Joowon Lee, Jung Ho Park, Myungjoo Kang

TL;DR

The paper tackles the challenge of capturing in-group favoritism in personalized recommendations by introducing the Co-Clustering Wrapper (CCW), which applies spectral co-clustering to the user-item bipartite graph to form $k$ co-clusters and trains a global CF model plus $k$ cluster-local models. It fuses global and local signals through a Local Importance Coefficient (LIC), enabling same-cluster interactions to benefit from localized embeddings via a ranking score $\hat{y}_{u,i} = e_{u,g}^T e_{i,g} + (\text{LIC}_u \cdot e_{u,l})^T (\text{LIC}_i \cdot e_{i,l})$, while cross-cluster pairs rely on global embeddings. The approach is validated on four public datasets with five baseline CF models, showing consistent gains in Recall@20 and NDCG@20 and demonstrating the utility of variance-ratio based cluster selection. CCW is model-agnostic and highlights practical benefits of incorporating locality and globality in graph-based recommender systems; future work includes neural-network-based co-clustering and synthetic graph data generation to stress-test GC models.
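The ranking score above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each LIC is a scalar weight per user or item, and the function name `ccw_score` and the `same_cluster` flag are hypothetical names introduced here.

```python
import numpy as np

def ccw_score(e_u_global, e_i_global, e_u_local, e_i_local,
              lic_u, lic_i, same_cluster):
    """Fuse global and local dot products into one ranking score.

    Assumption: lic_u and lic_i are scalars; the summary does not say
    whether the Local Importance Coefficient is a scalar or a vector.
    """
    # Global term: used for every user-item pair.
    score = float(e_u_global @ e_i_global)
    if same_cluster:
        # Local term: LIC-scaled local embeddings, added only when the
        # user and the item fall in the same co-cluster.
        score += float((lic_u * e_u_local) @ (lic_i * e_i_local))
    return score

# Toy embeddings (illustrative values only).
e_ug = np.array([1.0, 0.0])
e_ig = np.array([0.5, 2.0])
e_ul = np.array([1.0, 1.0])
e_il = np.array([2.0, 0.0])

same = ccw_score(e_ug, e_ig, e_ul, e_il, lic_u=0.5, lic_i=0.5, same_cluster=True)
cross = ccw_score(e_ug, e_ig, e_ul, e_il, lic_u=0.5, lic_i=0.5, same_cluster=False)
print(same, cross)
```

With these toy values the same-cluster pair scores higher than the cross-cluster pair because the local term contributes only in the first case.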

Abstract

In a personalized item recommendation system, it is essential to extract users' preferences and purchasing patterns. Assuming that real-world users form clusters and that each cluster shares a common favoritism, we introduce the Co-Clustering Wrapper (CCW). We compute co-clusters of users and items with co-clustering algorithms and add a CF subnetwork for each cluster to extract the in-group favoritism. Combining the features from these networks, we obtain rich and unified information about users. We experiment on real-world datasets with two aims: finding the number of groups into which users divide according to in-group preference, and measuring the resulting improvement in performance.
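The co-clustering step can be sketched with scikit-learn's `SpectralCoclustering`. This is an illustration under stated assumptions: the toy interaction matrix with three planted blocks is synthetic, and whether the authors used this particular implementation is not stated in the text.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
n_users, n_items, k = 30, 24, 3

# Synthetic implicit-feedback matrix (an assumption for illustration):
# every in-block pair interacts, plus sparse random cross-block noise.
user_block = np.repeat(np.arange(k), n_users // k)
item_block = np.repeat(np.arange(k), n_items // k)
in_block = user_block[:, None] == item_block[None, :]
noise = rng.random((n_users, n_items)) < 0.05
R = (in_block | noise).astype(float)

# Spectral co-clustering assigns every row (user) and every column (item)
# to exactly one of the k co-clusters.
model = SpectralCoclustering(n_clusters=k, random_state=0)
model.fit(R)
user_cluster = model.row_labels_
item_cluster = model.column_labels_

# Each co-cluster defines a local sub-matrix on which a cluster-local
# CF model could be trained; the full matrix R feeds the global model.
local_shapes = []
for c in range(k):
    users = np.flatnonzero(user_cluster == c)
    items = np.flatnonzero(item_cluster == c)
    R_local = R[np.ix_(users, items)]
    local_shapes.append(R_local.shape)
    print(f"co-cluster {c}: {R_local.shape[0]} users x {R_local.shape[1]} items")

# Boolean mask of user-item pairs that share a co-cluster.
same_cluster = user_cluster[:, None] == item_cluster[None, :]
```

The `same_cluster` mask marks the pairs for which a local score would be added on top of the global one.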

Paper Structure

This paper contains 18 sections, 13 equations, 6 figures, 3 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Comparison of our proposed CCW models and other base models. Our CCW wrapper reveals and utilizes in-group favoritism in recommendation data, which helps the base model achieve better performance for Recall@20 and NDCG@20 scores.
  • Figure 2: Co-clustering results of the incidence matrix for the Yelp2018 dataset. In each plot, rows represent users and columns represent items. (a) shows the incidence matrix of Yelp2018. (b) to (f) show the clustered results for different values of $k$, the target number of clusters. Each blue area in a plot is a computed co-cluster.
  • Figure 3: Overall flow chart of the Co-Clustering Wrapper. First, we apply SCC to the rating matrix, obtaining $k$ co-clusters. (In this figure, we set $k=3$.) Next, we apply CF models to the entire rating matrix and to the co-clustered rating matrix in parallel to compute the global and local embeddings. After rearranging the local embeddings to the original order, we aggregate the local and global embeddings, then compute the estimated rating $\hat{Y}$.
  • Figure 4: Plot of the mean variance ratio of $k$-clustering of Yelp2018. As $k$ increases, so does the variance ratio, but it hardly changes for $k\geq 9$. Based on the plot, we conclude that $k=9$ is a proper choice for clustering.
  • Figure 5: UMAP results for Yelp2018. The first two figures show the result of applying UMAP to the user nodes of the dataset. The last two figures are the results for the item nodes. For (a) and (c), points with the same color belong to the same cluster. These clusters are obtained from the result of spectral co-clustering. We find that the colors of the points almost coincide with the computed co-clusters.
  • ...and 1 more figures