Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

Tingjia Shen, Hao Wang, Jiaqing Zhang, Sirui Zhao, Liangyue Li, Zulong Chen, Defu Lian, Enhong Chen

TL;DR

This work tackles Cross-Domain Sequential Recommendation (CDSR) under cold-start by marrying a dual-graph sequence model with a retrieval-augmented Large Language Model (LLM). It introduces URLLM, which uses a graph-based representation to capture collaborative and semantic signals, a KNN-based user retriever to fetch domain-relevant peers, and a domain-grounding refinement loop to ensure generation stays within target domains. Empirical results on Amazon Movie-Game and Art-Office datasets show URLLM consistently outperforms traditional CDSR and prior LLM-based methods, with strong improvements in cold-start scenarios. The approach demonstrates the value of integrating structured information and language-based reasoning for cross-domain recommendations and provides code for reproducibility.
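The KNN-based user retrieval step mentioned above can be illustrated with a minimal sketch. It assumes each user is already represented by an embedding vector (for example, one produced by a sequence model) and that similarity is measured by cosine similarity; the function names, the toy embeddings, and the similarity choice are hypothetical illustrations and are not taken from the URLLM code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_similar_users(query_id, embeddings, k=2):
    """Return the ids of the k users whose embeddings are closest to the query user.

    `embeddings` maps user id -> vector. This is a brute-force KNN, which is an
    assumption made for clarity; a real system would likely use an ANN index.
    """
    query = embeddings[query_id]
    scored = [
        (cosine(query, vec), uid)
        for uid, vec in embeddings.items()
        if uid != query_id
    ]
    # Sort by descending similarity, breaking ties by user id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uid for _, uid in scored[:k]]


# Toy, made-up embeddings purely for illustration.
toy_embeddings = {
    "u1": [1.0, 0.0],
    "u2": [0.9, 0.1],
    "u3": [0.0, 1.0],
    "u4": [0.7, 0.7],
}
peers = retrieve_similar_users("u1", toy_embeddings, k=2)
```

In a retrieval-augmented setup, the interaction histories of the retrieved peers would then be added to the LLM prompt as few-shot context for the cold-start user.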

Abstract

Cross-Domain Sequential Recommendation (CDSR) aims to mine and transfer users' sequential preferences across different domains to alleviate the long-standing cold-start issue. Traditional CDSR models capture collaborative information through user and item modeling while overlooking valuable semantic information. Recently, Large Language Models (LLMs) have demonstrated powerful semantic reasoning capabilities, motivating us to introduce them to better capture semantic information. However, introducing LLMs to CDSR is non-trivial due to two crucial issues: seamless information integration and domain-specific generation. To this end, we propose a novel framework named URLLM, which aims to improve CDSR performance by jointly exploring user retrieval and domain grounding on LLMs. Specifically, we first present a novel dual-graph sequential model to capture the diverse information, along with an alignment and contrastive learning method to facilitate domain knowledge transfer. Subsequently, a user retrieval-generation model is adopted to seamlessly integrate the structural information into the LLM, fully harnessing its emergent inference ability. Furthermore, we propose a domain-specific strategy and a refinement module to prevent out-of-domain generation. Extensive experiments on Amazon datasets demonstrate the information integration and domain-specific generation ability of URLLM in comparison to state-of-the-art baselines. Our code is available at https://github.com/TingJShen/URLLM
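To make the idea of preventing out-of-domain generation concrete, here is a minimal sketch of one possible post-generation check. It maps a generated item title onto the closest title in a target-domain catalogue using fuzzy string matching from Python's standard library, and rejects the output when nothing is close enough. This is an illustrative assumption, not the paper's refinement module; the function name `ground_to_domain`, the `cutoff` value, and the toy catalogue are all hypothetical.

```python
import difflib


def ground_to_domain(generated_title, domain_titles, cutoff=0.6):
    """Map an LLM-generated title to the closest title in the target domain.

    Returns the matching catalogue title, or None when no catalogue entry is
    similar enough (i.e. the generation is treated as out-of-domain).
    Fuzzy string matching is a stand-in chosen for this sketch only.
    """
    # Compare case-insensitively, but return the catalogue's original casing.
    lowered = {title.lower(): title for title in domain_titles}
    matches = difflib.get_close_matches(
        generated_title.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


# Toy game-domain catalogue, made up for illustration.
game_catalogue = ["The Legend of Zelda: Breath of the Wild", "Halo 3"]

in_domain = ground_to_domain("the legend of zelda breath of wild", game_catalogue)
out_of_domain = ground_to_domain("Inception", game_catalogue)
```

A rejected generation (a `None` result) could trigger a retry or a fallback ranking; how URLLM itself handles that case is described in the paper and repository rather than here.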

Paper Structure (32 sections, 15 equations, 5 figures, 7 tables, 1 algorithm)

This paper contains 32 sections, 15 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of the cold-start CDSR task along with the various forms of information and the domain-specific demands on information and generation. The lines linking attributes represent the structural-semantic information.
  • Figure 2: The overall framework of URLLM. The component on the left showcases exemplary prompts employed in graph construction and the similar user-augmented LLM module. On the right, the process is delineated, wherein the expansive reasoning and few-shot analogy capabilities of the LLM are harnessed, concomitantly integrating structured knowledge.
  • Figure 3: An example case illustrating the importance of LLM inference and similar-user retrieval.
  • Figure 4: Performance comparison in warm (left) and cold (right) scenarios for Movie-Game and Art-Office. "UC" denotes replacing the retrieval model of URLLM with $\bm{C^2DSR}$, and "w/o R" denotes URLLM without user retrieval.
  • Figure 5: The positive correlation between the quality of retrieved users and model performance.