Multi-view Intent Learning and Alignment with Large Language Models for Session-based Recommendation

Shutong Qiao, Wei Zhou, Junhao Wen, Chen Gao, Qun Luo, Peixuan Chen, Yong Li

TL;DR

This work proposes an LLM-enhanced SBR framework that integrates semantic and behavioral signals from multiple views, and leverages the strengths of both LLMs and traditional SBR models while minimizing training costs.

Abstract

Session-based recommendation (SBR) methods often rely on user behavior data alone, and the sparsity of session data limits their performance. Prior work has shown that, beyond behavioral signals, the rich semantic information in item descriptions is crucial for capturing hidden user intent. Large language models (LLMs) offer new ways to exploit this semantic information, but session anonymity, short sequences, and high LLM training costs have hindered the development of a lightweight, efficient LLM framework for SBR. To address these challenges, we propose LLM4SBR, an LLM-enhanced SBR framework that integrates semantic and behavioral signals from multiple views. This two-stage framework leverages the strengths of both LLMs and traditional SBR models while minimizing training costs. In the first stage, we use multi-view prompts to infer latent user intentions at the session semantic level, supported by an intent localization module that alleviates LLM hallucinations. In the second stage, we align and unify these semantic inferences with behavioral representations, merging insights from both large and small models. Extensive experiments on two real-world datasets demonstrate that LLM4SBR effectively improves model performance. We release our code along with the baselines at https://github.com/tsinghua-fib-lab/LLM4SBR.
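
To make the first stage concrete, here is a minimal Python sketch of multi-view prompting followed by intent localization. It is an illustration under assumptions, not the released implementation: the function names (`build_prompts`, `localize_intent`), the prompt wording, the `short_window` parameter, and the use of token-overlap (Jaccard) similarity to map an LLM answer back onto a real catalog item are all hypothetical choices made for this example. The abstract states only that prompts come from multiple views and that a localization module is used to alleviate hallucinations.

```python
import re


def build_prompts(titles, short_window=1):
    """Return (long_term_prompt, short_term_prompt) for one anonymous session.

    `titles` is the ordered list of item titles in the session.
    `short_window` (hypothetical parameter) is how many recent items
    the short-term view is allowed to see.
    """
    if not titles:
        raise ValueError("session must contain at least one item")
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(titles, start=1))
    recent = "\n".join(f"- {t}" for t in titles[-short_window:])
    # Prompt wording is illustrative only.
    long_term = (
        "A user interacted with these items in order:\n"
        f"{numbered}\n"
        "Considering the whole session, name one item the user may want next."
    )
    short_term = (
        "The user's most recent interaction(s):\n"
        f"{recent}\n"
        "Considering only this recent interest, name one item the user may want next."
    )
    return long_term, short_term


def _tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def localize_intent(llm_answer, catalog):
    """Map free-form LLM text to the closest real catalog title.

    Assumption: Jaccard token overlap stands in for whatever similarity
    measure the actual localization module uses.
    """
    if not catalog:
        raise ValueError("catalog must not be empty")
    answer = _tokens(llm_answer)

    def jaccard(title):
        title_tokens = _tokens(title)
        union = answer | title_tokens
        return len(answer & title_tokens) / len(union) if union else 0.0

    return max(catalog, key=jaccard)
```

Because the localized output is always a title that exists in the catalog, an answer that names a non-existent item is replaced by its nearest real neighbor. A usage example: `localize_intent("Maybe Toy Story 3 or a similar Pixar film", ["Toy Story 2", "The Godfather", "Finding Dory"])` selects `"Toy Story 2"`.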

Paper Structure (32 sections, 15 equations, 8 figures, 5 tables, 1 algorithm)

Figures (8)

  • Figure 1: LLM4SBR framework diagram. LLM4SBR is a two-stage framework: (a) In the intent inference stage, the LLM makes initial inferences based on prompts from different views (long-term and short-term). The intent localization module is then applied to alleviate hallucinations and enhance the semantics of the inference results. (b) In the representation enhancement stage, interaction data and text data are loaded into the model synchronously. Traditional SBR models encode the interaction data to obtain local and global session representations. After the session representations and inference representations of the same view are aligned and made uniform, all representations are fused into the final session representation for prediction.
  • Figure 2: Illustration of the prompt design. The LLM performs inference on the short-term and long-term prompts separately to obtain inference results from different views.
  • Figure 3: An example of LLM inference from different views.
  • Figure 4: The result of the intent localization module.
  • Figure 5: Performance Comparison with SBR Models Combining LLMs.
  • ...and 3 more figures
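
The representation enhancement stage described in the Figure 1 caption (align, make uniform, then fuse) can be sketched as follows. This is a hedged illustration rather than the paper's actual objective: the caption does not give the loss definitions, so the code assumes a common formulation from contrastive representation learning (mean squared distance between paired unit vectors for alignment, log of the mean pairwise Gaussian potential for uniformity) and assumes element-wise summation for fusion. The names `alignment_loss`, `uniformity_loss`, `fuse`, and the temperature `t` are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def alignment_loss(session_reps, intent_reps):
    """Mean squared distance between paired unit vectors.

    Lower values mean the behavioral session representation and the
    LLM-inferred intent representation of the same view are closer.
    """
    if len(session_reps) != len(intent_reps) or not session_reps:
        raise ValueError("need equally sized, non-empty batches")
    total = 0.0
    for s, z in zip(session_reps, intent_reps):
        s, z = l2_normalize(s), l2_normalize(z)
        total += sum((a - b) ** 2 for a, b in zip(s, z))
    return total / len(session_reps)


def uniformity_loss(reps, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower values mean the unit vectors are spread more evenly.
    """
    if len(reps) < 2:
        raise ValueError("need at least two vectors")
    units = [l2_normalize(v) for v in reps]
    potentials = []
    for i in range(len(units)):
        for j in range(i + 1, len(units)):
            d2 = sum((a - b) ** 2 for a, b in zip(units[i], units[j]))
            potentials.append(math.exp(-t * d2))
    return math.log(sum(potentials) / len(potentials))


def fuse(representations):
    """Element-wise sum of same-length vectors (assumed fusion rule)."""
    return [sum(column) for column in zip(*representations)]
```

In such a setup, the alignment term would pull each view's session representation toward its matching inference representation, the uniformity term would keep representations from collapsing to a single point, and the fused vector would be scored against candidate items for prediction.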