Enhancing Recommender Systems Using Textual Embeddings from Pre-trained Language Models

Ngoc Luyen Le, Marie-Hélène Abel

TL;DR

The paper tackles the limited semantic understanding of traditional recommender systems (RSs) by converting structured user, item, and context data into natural language representations and using pre-trained language models (PLMs; e.g., BERT, DistilBERT, RoBERTa) to produce rich textual embeddings. These embeddings are fused with conventional features through a five-layer data enrichment architecture (Input, Textual Embedding, Concatenation, Deep Recommender, Output) and fed to deep learning-based recommenders. On the MovieLens ML-1M dataset, PLM-based enrichment generally improves performance (measured by AUC and LogLoss) over baselines trained on the raw features, though the gains depend on the RS model and the PLM used. The results demonstrate the potential of PLMs to enhance personalization and context-awareness in recommendations, with future work focusing on scalability and domain-specific adaptations.
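To make the enrichment flow concrete, the following is a minimal sketch of the first three layers (Input, Textual Embedding, Concatenation), not the authors' implementation. The function names, the sentence template, and the 8-dimensional vector size are all illustrative assumptions. In particular, `toy_text_embedding` is a hash-based stand-in that carries no semantic meaning; in the paper's setting it would be replaced by the output of a PLM encoder such as BERT.

```python
import hashlib
from typing import Dict, List


def record_to_text(record: Dict[str, object]) -> str:
    """Input layer: turn a structured record into a plain sentence.

    The "key is value" template is an assumption made for illustration.
    """
    return ", ".join(f"{key} is {value}" for key, value in record.items()) + "."


def toy_text_embedding(text: str, dim: int = 8) -> List[float]:
    """Textual Embedding layer (stand-in, NOT a real PLM).

    Maps text to a deterministic vector in [0, 1] via SHA-256 so the sketch
    runs without model weights. A real pipeline would call a PLM encoder here.
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def enrich(features: List[float], record: Dict[str, object], dim: int = 8) -> List[float]:
    """Concatenation layer: conventional features followed by the text embedding."""
    return list(features) + toy_text_embedding(record_to_text(record), dim)


# Hypothetical user record and already-encoded conventional features.
user = {"gender": "F", "age": 25, "occupation": "engineer"}
conventional = [1.0, 0.0, 0.25]

enriched = enrich(conventional, user)
print(record_to_text(user))
print(len(enriched))  # 3 conventional values + 8 embedding values
```

The enriched vector is what a deep recommender would consume in the fourth layer. Keeping the conventional features at the front of the vector means the text embedding only adds information, which matches the summary's description of enrichment as a fusion rather than a replacement.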

Abstract

Recent advancements in language models, particularly pre-trained language models (PLMs) like BERT and RoBERTa, have revolutionized natural language processing, enabling a deeper understanding of human-like language. In this paper, we explore enhancing recommender systems using textual embeddings from PLMs to address the limitations of traditional recommender systems that rely solely on explicit features from users, items, and user-item interactions. By transforming structured data into natural language representations, we generate high-dimensional embeddings that capture deeper semantic relationships between users, items, and contexts. Our experiments demonstrate that this approach significantly improves recommendation accuracy and relevance, resulting in more personalized and context-aware recommendations. The findings underscore the potential of PLMs to enhance the effectiveness of recommender systems.

Paper Structure

This paper contains 12 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: The data enrichment architecture based on PLMs for RSs