Improving Sequential Recommendations with LLMs

Artun Boz; Wouter Zorgdrager; Zoe Kotti; Jesse Harte; Panos Louridas; Dietmar Jannach; Vassilios Karakoidas; Marios Fragkoulis

TL;DR

This work investigates how Large Language Models (LLMs) can improve sequential recommendation through three orthogonal approaches and their hybrids: LLMSeqSim, which recommends items based on semantic item embeddings obtained from an LLM; LLMSeqPrompt, which fine-tunes an LLM with prompts for next-item generation or ranking; and LLM2Sequential, which injects LLM-derived item representations into existing sequential models. Across three datasets (Amazon Beauty, Delivery Hero, Steam), incorporating LLM embeddings into existing sequential models yields substantial accuracy gains, with LLM2Sequential variants often achieving the largest improvements in NDCG@20 along with notable boosts in catalog coverage and serendipity. Fine-tuning LLMs (OpenAI GPT and Google PaLM 2) yields strong performance gains across several tasks, with GPT generally outperforming PaLM in the reported setups and reducing hallucinations. The results highlight the practical value of leveraging semantic knowledge from LLMs in sequential recommendation, and the study provides a comprehensive, reproducible evaluation of multiple LLM-based strategies and hybrids for real-world deployment.
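To make the embedding-based idea behind LLMSeqSim concrete, the following is a minimal sketch, not the authors' implementation. It assumes the session embedding is a plain average of the embeddings of the items in the session, that candidates are ranked by cosine similarity, and that items already in the session are excluded; the paper's section on session embedding computation covers the choices actually evaluated. The item names, the toy 3-dimensional vectors, and the helper names (`mean_vector`, `cosine`, `recommend`) are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(session, item_embeddings, k=20):
    """Rank unseen catalog items by similarity to the session embedding.

    Assumption: the session embedding is the unweighted mean of the
    embeddings of the items in the session.
    """
    session_vec = mean_vector([item_embeddings[item] for item in session])
    candidates = [item for item in item_embeddings if item not in session]
    candidates.sort(
        key=lambda item: cosine(session_vec, item_embeddings[item]),
        reverse=True,
    )
    return candidates[:k]


# Toy vectors standing in for LLM embeddings (hypothetical data).
ITEM_EMBEDDINGS = {
    "shampoo": [0.9, 0.1, 0.0],
    "conditioner": [0.8, 0.2, 0.0],
    "hair_mask": [0.7, 0.3, 0.1],
    "lipstick": [0.0, 0.1, 0.9],
}

top = recommend(["shampoo", "conditioner"], ITEM_EMBEDDINGS, k=2)
print(top)
```

In practice the vectors would come from an LLM embedding model and would typically be reduced in dimensionality first; both steps are left out here to keep the sketch self-contained.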

Abstract

The sequential recommendation problem has attracted considerable research attention in the past few years, leading to the rise of numerous recommendation models. In this work, we explore how Large Language Models (LLMs), which are currently disrupting many AI-based applications, can be used to build or improve sequential recommendation approaches. Specifically, we design three orthogonal approaches, and hybrids of those, to leverage the power of LLMs in different ways. In addition, we investigate the potential of each approach by examining its constituent technical aspects and identifying a range of alternative choices for each one. We conduct extensive experiments on three datasets and explore a large variety of configurations, including different language models and baseline recommendation models, to obtain a comprehensive picture of the performance of each approach. Among other observations, we highlight that initializing state-of-the-art sequential recommendation models such as BERT4Rec or SASRec with embeddings obtained from an LLM can lead to substantial performance gains in terms of accuracy. Furthermore, we find that fine-tuning an LLM for recommendation tasks enables it to learn not only the tasks, but also, to some extent, concepts of the domain. We also show that fine-tuning OpenAI GPT leads to considerably better performance than fine-tuning Google PaLM 2. Overall, our extensive experiments indicate considerable potential value in leveraging LLMs in future recommendation approaches. We publicly share the code and data of our experiments to ensure reproducibility.
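The TL;DR reports accuracy as NDCG@20. As a hedged illustration of what that metric measures, the sketch below assumes each test case has exactly one relevant item (the held-out next item), in which case the ideal DCG is 1 and NDCG@k reduces to 1 / log2(rank + 1) when the target appears at 1-based position `rank` within the top k, and 0 otherwise. The function names and the toy rankings are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math


def ndcg_at_k(ranked_items, target, k=20):
    """NDCG@k for one test case with a single relevant item (assumption)."""
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1  # 1-based position of the target
    return 1.0 / math.log2(rank + 1)


def mean_ndcg(cases, k=20):
    """Average NDCG@k over (ranked_items, target) pairs."""
    return sum(ndcg_at_k(ranked, target, k) for ranked, target in cases) / len(cases)


# Toy rankings (hypothetical data): a hit at rank 1, a hit at rank 3, and a miss.
cases = [
    (["a", "b", "c"], "a"),
    (["a", "b", "c"], "c"),
    (["a", "b", "c"], "z"),
]
score = mean_ndcg(cases, k=20)
print(round(score, 4))
```

A hit at the top of the list contributes 1.0, a hit at rank 3 contributes 0.5, and a miss contributes nothing, so the metric rewards placing the true next item as high as possible.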

Paper Structure

This paper contains 43 sections, 2 equations, 12 figures, 7 tables.

Introduction
Background & Related Work
Cross-domain LLM-based Sequential Recommendation Models
Domain-specific LLM-based Sequential Recommendation Models
Multi-modal LLM-based Sequential Recommendation Models
Comparison to Previous Paper and Related Work
LLMSeqSim: Semantic Item Recommendations via LLM Embeddings
Embedding Sources
Dimensionality Reduction Methods
Number of Reduced Dimensions
Session Embedding Computation
LLMSeqPrompt: Prompt-based Recommendations by a Fine-Tuned LLM
Fine-tuning Task: Generate a Single Recommended Item
Fine-tuning Task: Generate a List of Item Recommendations
Fine-tuning Task: Classify Items for Next Item Recommendations
...and 28 more sections

Figures (12)

Figure 1: Layout of work. Each top-level branch of the tree shows one approach presented in the corresponding section. In brackets we indicate the alternatives we used.
Figure 2: Semantic item recommendations via LLM embeddings.
Figure 3: Next-item generation by fine-tuned LLM.
Figure 4: Recommendation list generation by fine-tuned LLM.
Figure 5: Multi-class classification of items by fine-tuned LLM.
...and 7 more figures
