Improving Content Recommendation: Knowledge Graph-Based Semantic Contrastive Learning for Diversity and Cold-Start Users

Yejin Kim; Scott Rome; Kevin Foley; Mayur Nankani; Rimon Melamed; Javier Morales; Abhay Yadav; Maria Peifer; Sardar Hamidian; H. Howie Huang

TL;DR

Jointly training on user-item interactions and item-based signals using synopsis text is highly effective, and item-based contrastive learning improves the quality of entity embeddings, as indicated by metrics such as uniformity and alignment.
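The uniformity and alignment metrics named above can be illustrated with a small sketch. It assumes the definitions commonly used in the contrastive learning literature (mean distance between positive pairs for alignment, log of the mean Gaussian potential over all pairs for uniformity); the page does not state which exact formulation the paper uses. The function names and the defaults `alpha=2` and `t=2.0` are illustrative assumptions.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length so distances live on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def alignment(positive_pairs, alpha=2):
    """Mean distance**alpha between embeddings of positive pairs (lower is better).

    Assumed definition; `alpha` is an illustrative default.
    """
    total = 0.0
    for a, b in positive_pairs:
        total += math.dist(l2_normalize(a), l2_normalize(b)) ** alpha
    return total / len(positive_pairs)


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (lower is better).

    Assumed definition; `t` is an illustrative default.
    """
    points = [l2_normalize(e) for e in embeddings]
    potentials = [
        math.exp(-t * math.dist(a, b) ** 2) for a, b in combinations(points, 2)
    ]
    return math.log(sum(potentials) / len(potentials))
```

Under these definitions, identical positive pairs give an alignment of zero, and embeddings that collapse to one point give a uniformity of zero, the worst possible value.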

Abstract

Addressing data sparsity, cold-start problems, and diversity in recommendation systems is both crucial and demanding. Many current solutions leverage knowledge graphs to tackle these issues by combining item-based and user-item collaborative signals. A common trend in these approaches is to improve ranking performance at the cost of escalating model complexity, reduced diversity, and a more complicated task. It is essential to provide recommendations that are both personalized and diverse, rather than relying solely on high scores on rank-based metrics such as click-through rate and recall. In this paper, we propose a hybrid multi-task learning approach that trains on both user-item and item-item interactions. We apply item-based contrastive learning on descriptive text, sampling positive and negative pairs based on item metadata. By utilizing semantic information from text, our approach allows the model to better understand the relationships between entities within the knowledge graph. This leads to more accurate, relevant, and diverse recommendations, a benefit that extends even to cold-start users who have few interactions with items. We perform extensive experiments on two widely used datasets to validate the effectiveness of our approach. Our findings demonstrate that jointly training on user-item interactions and item-based signals using synopsis text is highly effective. Furthermore, our results provide evidence that item-based contrastive learning enhances the quality of entity embeddings, as indicated by metrics such as uniformity and alignment.
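The abstract's idea of sampling positive and negative pairs from item metadata and adding a contrastive term to the user-item loss can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes genre overlap as the metadata rule, an InfoNCE-style loss with cosine similarity, and a simple weighted sum for the joint objective. All names (`sample_pairs`, `info_nce`, `joint_loss`, `temperature`, `weight`) are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def sample_pairs(anchor, genres, num_negatives, rng):
    """Pick one positive that shares a genre with the anchor, and negatives that share none.

    `genres` maps item id -> set of genre labels (assumed metadata format).
    Returns None when the anchor has no positive or no negative candidates.
    """
    others = [item for item in genres if item != anchor]
    positives = [item for item in others if genres[item] & genres[anchor]]
    negatives = [item for item in others if not genres[item] & genres[anchor]]
    if not positives or not negatives:
        return None
    k = min(num_negatives, len(negatives))
    return rng.choice(positives), rng.sample(negatives, k)


def info_nce(anchor_vec, positive_vec, negative_vecs, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to its positive than to negatives."""
    pos = math.exp(cosine(anchor_vec, positive_vec) / temperature)
    neg = sum(math.exp(cosine(anchor_vec, n) / temperature) for n in negative_vecs)
    return -math.log(pos / (pos + neg))


def joint_loss(base_loss, contrastive_loss, weight=1.0):
    """Assumed multi-task objective: user-item loss plus a weighted contrastive term."""
    return base_loss + weight * contrastive_loss


# Toy data: in practice the vectors would come from a language model encoding each synopsis.
genres = {"a": {"drama"}, "b": {"drama", "crime"}, "c": {"comedy"}, "d": {"horror"}}
emb = {"a": [1.0, 0.1], "b": [0.9, 0.2], "c": [-1.0, 0.3], "d": [0.0, -1.0]}
rng = random.Random(0)
pos, negs = sample_pairs("a", genres, 2, rng)
cl = info_nce(emb["a"], emb[pos], [emb[n] for n in negs])
total = joint_loss(0.5, cl, weight=0.1)
```

In the toy data, item `b` is the only candidate positive for `a` because it is the only other item sharing a genre; the loss is small because their embeddings already point in nearly the same direction.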

Paper Structure

This paper contains 21 sections, 7 equations, 6 figures, 5 tables.

Introduction
Related Works
Pre-trained Language Model
PLMs in Recommendation
Contrastive Learning
CL in Recommendation
Proposed Methods
Sampling Strategy
Content-based Contrastive Loss
Training
Inference
Experiments
Experimental Setup
Dataset
Encoders
...and 6 more sections

Figures (6)

Figure 1: Performance Decline in Cold-Start Scenario. As a baseline for comparison with our model, we use KGCN (Wang et al., 2019), a pivotal component of KG recommendation models. 'Cold-start users' refers to the bottom 1% of test users, that is, those with the fewest user-content interactions.
Figure 2: Overview of the Proposed Model. (a) Illustration of a knowledge graph structure and a model with multiple objectives. The loss from user-content interactions is labeled $L_{base}$, and the content-based contrastive loss is $CL$. (b) Detailed process of our proposed objective function $CL$. To generate initial content node embeddings, a pre-trained language model encodes the synopsis of each content. The positive and negative pairs are selected for each content based on their genre or title metadata (as outlined in the Encoders section).
Figure 3: Sampling Positive/Negative Pairs Using Cross-encoder for $CL$
Figure 4: Inference Using the Trained Content and User Embeddings. Here $dot$ denotes the inner product and $\sigma$ denotes the sigmoid function. Each user embedding, indexed by $n$, is combined with each content embedding, indexed by $m$, through an inner product, and the sigmoid is then applied to the result. This procedure enables the model to rank all content for each user.
Figure 5: Comparing Model Performance Across a Spectrum of User Activity Levels. The distribution on the x-axis is based on the number of user-content interactions of each test user. For example, 1% refers to the test users with the fewest user-content interactions.
...and 1 more figures
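
The inference step described in the Figure 4 caption (inner product of a user embedding with each content embedding, followed by a sigmoid, then ranking) can be sketched as below. This is an illustrative sketch only; the function names, the dictionary layout of `content_vecs`, and the `top_k` parameter are assumptions and do not come from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score(user_vec, content_vec):
    """Sigmoid of the inner product between one user and one content embedding."""
    return sigmoid(sum(u * c for u, c in zip(user_vec, content_vec)))


def rank_contents(user_vec, content_vecs, top_k=None):
    """Score every content item for a user and return (id, score) pairs, best first.

    `content_vecs` maps content id -> embedding (assumed layout).
    `top_k=None` returns the full ranking.
    """
    scored = [(cid, score(user_vec, vec)) for cid, vec in content_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings for three content items and one user.
contents = {"m1": [1.0, 0.0], "m2": [0.0, 1.0], "m3": [-1.0, 0.0]}
ranking = rank_contents([2.0, 0.5], contents)
```

Because the sigmoid is monotonic, it does not change the ordering produced by the raw inner products; it only maps each score into the range (0, 1).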
