Towards a Unified Paradigm: Integrating Recommendation Systems as a New Language in Large Models

Kai Zheng, Qingfeng Sun, Can Xu, Peng Yu, Qingwei Guo

TL;DR

The paper proposes RSLLM, a unified paradigm that treats the behavioral signals of traditional sequential recommenders as a new language for large language models. Its unified prompting blends ID-based item embeddings with textual item features through Text-ID prompting and a hybrid encoding with a trainable projector, so that the LLM can process both semantic and collaborative information. A two-stage fine-tuning framework, in which text-only fine-tuning is followed by target-domain fine-tuning with unified prompts, trains the model with a Next Item Prediction loss plus two InfoNCE-like contrastive alignment losses to fuse behavioral knowledge from the recommender into the LLM. Empirical results on MovieLens, Steam, and LastFM show that RSLLM outperforms both traditional and other LLM-based sequential recommenders with high validity, and ablations support the importance of multimodal item representations, cross-domain alignment, and staged training. The work suggests a practical path toward more accurate, knowledge-rich, instruction-following sequential recommendation with large language models, albeit with notable computational demands.

Abstract

This paper explores the use of Large Language Models (LLMs) for sequential recommendation, which predicts users' future interactions based on their past behavior. We introduce a new concept, "Integrating Recommendation Systems as a New Language in Large Models" (RSLLM), which combines the strengths of traditional recommenders and LLMs. RSLLM uses a unique prompting method that combines ID-based item embeddings from conventional recommendation models with textual item features. It treats users' sequential behaviors as a distinct language and aligns the ID embeddings with the LLM's input space using a projector. We also propose a two-stage LLM fine-tuning framework that refines a pretrained LLM using a combination of two contrastive losses and a language modeling loss. The LLM is first fine-tuned using text-only prompts, followed by target domain fine-tuning with unified prompts. This trains the model to incorporate behavioral knowledge from the traditional sequential recommender into the LLM. Our empirical results validate the effectiveness of our proposed framework.
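The abstract names two mechanisms that a small example can make concrete: a projector that maps ID embeddings into the LLM's input space, and a training objective that adds two contrastive losses to a language modeling loss. The sketch below is illustrative only and is not the paper's implementation. The linear form of the projector, the symmetric ID-to-text and text-to-ID pairing of the two contrastive terms, the cosine similarity, the temperature, and every function and parameter name (`project`, `info_nce`, `total_loss`, `contrastive_weight`) are assumptions introduced here.

```python
import math


def project(id_embedding, weight, bias):
    """Hypothetical linear projector: maps a recommender ID embedding
    (length d_id) to a vector of the LLM's input width (length d_llm).
    `weight` has shape d_llm x d_id and `bias` has length d_llm."""
    return [
        sum(w * x for w, x in zip(row, id_embedding)) + b
        for row, b in zip(weight, bias)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch. positives[i] is the match for
    anchors[i]; the other entries of `positives` act as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(
            sum(math.exp(l - peak) for l in logits)
        )
        total += log_denominator - logits[i]
    return total / len(anchors)


def total_loss(lm_loss, id_vectors, text_vectors, contrastive_weight=1.0):
    """Assumed combination: language modeling loss plus two contrastive
    terms (ID-to-text and text-to-ID). The weighting is a placeholder."""
    id_to_text = info_nce(id_vectors, text_vectors)
    text_to_id = info_nce(text_vectors, id_vectors)
    return lm_loss + contrastive_weight * (id_to_text + text_to_id)
```

With well-aligned pairs the contrastive terms approach zero, so the total is dominated by the language modeling loss; with mismatched pairs they grow, which is the pressure that pulls projected ID vectors toward the text representation of the same item.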

Paper Structure

This paper contains 16 sections, 5 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of prior item representation methods and ours. (a) ID Number: represents an item with a numerical index. (b) Text Metadata: represents an item with its textual features, such as the item title. (c) RSLLM (ours): integrates both textual tokens and behavioral tokens derived from the ID-based item embedding learned by traditional recommender models.
  • Figure 2: Results of recommendation model efficiency analysis. We compare RSLLM with strong baselines using the Caser backbone. The GRU4Rec and SASRec results are presented in the appendix (Figures 4 and 5).
  • Figure 3: Performance comparison of different item representation methods (numerical index, behavioral token, textual feature, LLaRA, and RSLLM representation) on the MovieLens, Steam, and LastFM datasets.
  • Figure 4: Results of recommendation model efficiency analysis. We compare RSLLM with strong baselines using the GRU4Rec backbone.
  • Figure 5: Results of recommendation model efficiency analysis. We compare RSLLM with strong baselines using the SASRec backbone.