Enhancing Sequential Recommendation with World Knowledge from Large Language Models

Tianjie Dai, Xu Chen, Yunmeng Shu, Jinsong Lan, Xiaoyong Zhu, Jiangchao Yao, Bo Zheng

TL;DR

GRASP tackles the challenge of enriching sequential recommendation with world knowledge from LLMs while guarding against hallucinations. It combines generation augmented retrieval, which constructs semantic embeddings of users and items, with a multi-level holistic attention (HA) mechanism that uses retrieved similar users/items as context rather than as supervision signals. The approach is orthogonal to the choice of backbone and yields consistent, state-of-the-art gains across three benchmarks, including an industrial dataset. It also addresses practical deployment: LLM embeddings are computed offline, and only a limited amount of HA computation runs online. Empirically, GRASP improves ranking metrics and remains robust in long-tail scenarios, translating to measurable online gains in a real e-commerce setting.

Abstract

Sequential recommendation systems (SRS), which predict a user's subsequent actions from their historical behavior, have become pivotal in modern society. However, traditional collaborative filtering-based sequential recommendation models often perform suboptimally because their collaborative signals carry limited information. With the rapid development of LLMs, an increasing number of works have incorporated LLMs' world knowledge into sequential recommendation. Although they achieve considerable gains, these approaches typically assume that LLM-generated results are correct and remain susceptible to noise induced by LLM hallucinations. To overcome these limitations, we propose GRASP (Generation Augmented Retrieval with Holistic Attention for Sequential Prediction), a flexible framework with two components: generation augmented retrieval, which performs descriptive synthesis and similarity retrieval, and holistic attention enhancement, which uses multi-level attention to exploit LLMs' world knowledge even in the presence of hallucinations and to better capture users' dynamic interests. The retrieved similar users/items serve as auxiliary contextual information for the subsequent holistic attention enhancement module, mitigating the noisy guidance that affects supervision-based methods. Comprehensive evaluations on two public benchmarks and one industrial dataset show that GRASP consistently achieves state-of-the-art performance when integrated with diverse backbones. The code is available at: https://anonymous.4open.science/r/GRASP-SRS.
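The two ideas in the abstract, retrieving similar users/items from LLM-derived embeddings and then attending over them as context, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`top_k_similar`, `attend`, `enhance`), the use of cosine similarity, and the single-level scaled dot-product attention are all assumptions made for illustration, and the paper's multi-level attention is not reproduced here.

```python
import numpy as np

def top_k_similar(query, pool, k):
    """Return indices of the k pool rows most cosine-similar to query.

    Hypothetical stand-in for the similarity-retrieval step; cosine
    similarity is an assumption, not taken from the paper.
    """
    q = query / np.linalg.norm(query)
    p = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores)[:k]

def attend(query, context):
    """Scaled dot-product attention of one query vector over context rows."""
    d = query.shape[0]
    logits = context @ query / np.sqrt(d)
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ context, weights

def enhance(user_vec, user_pool, k=3):
    """Fuse a user embedding with its retrieved neighbours.

    The neighbours act only as attention context: nothing forces the
    output to match them, so a noisy neighbour can receive low weight.
    """
    idx = top_k_similar(user_vec, user_pool, k)
    context = np.vstack([user_vec, user_pool[idx]])  # self + neighbours
    fused, weights = attend(user_vec, context)
    return fused, weights, idx
```

In this sketch the retrieved embeddings never enter a loss term; they only supply extra rows for the attention step, which is one plausible reading of "auxiliary contextual information" in the abstract.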

Paper Structure

This paper contains 19 sections, 6 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: (a) GRU4Rec, trained on the Industry-100K dataset collected from an e-commerce platform, over-emphasizes frequent interactions while underrepresenting diverse or contextually related user intents. (b) An example of LLM hallucination, where generated content deviates from reality: the LLM erroneously characterizes a user with moderate purchasing power and a focus on health and wellness as opting for luxury goods. (c) Sequence length distribution of Industry-100K and hallucination rate across user groups defined by interaction sequence length; hallucinations increase significantly as sequence length decreases.
  • Figure 2: Overview of the GRASP framework, which sits flexibly on top of sequential recommendation baselines and consists of generation augmented retrieval and holistic attention enhancement. The pipeline shows the workflow and the role of each module within GRASP.
  • Figure 3: Examples of LLM hallucinations in Industry-100K and comparison of recommendation results from different models.
  • Figure 4: Hyper-parameter analysis of GRASP with a SASRec backbone on the Beauty dataset. Left: $N$, the size of the candidate pool for similarity retrieval. Right: $d$, the hidden dimension of the SRS.