Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen

TL;DR

This work tackles the gap between language semantics in LLMs and collaborative semantics in recommender systems by introducing LC-Rec, which learns discrete, tree-structured item indices via a Residual-Quantized VAE with uniform semantic mapping. It then trains LLMs with a suite of alignment tasks (sequential item prediction, explicit index-language alignment, and implicit recommendation-oriented alignment) to fuse language and collaborative information for end-to-end item generation over the full item set. Empirical results on three real-world Amazon subsets show LC-Rec achieving state-of-the-art full-ranking performance, with significant gains over strong baselines; ablations highlight the value of the indexing and tuning strategies. The approach enables autoregressive, candidate-free recommendation while maintaining semantic coherence and offering practical efficiency improvements, suggesting strong potential for scalable, language-enhanced recommender systems and future multi-turn chat extensions.

Abstract

Recently, large language models (LLMs) have shown great potential in recommender systems, either improving existing recommendation models or serving as the backbone. However, there exists a large semantic gap between LLMs and recommender systems, since items to be recommended are often indexed by discrete identifiers (item IDs) outside the LLM's vocabulary. In essence, LLMs capture language semantics while recommender systems imply collaborative semantics, making it difficult to sufficiently leverage the model capacity of LLMs for recommendation. To address this challenge, in this paper, we propose a new LLM-based recommendation model called LC-Rec, which can better integrate language and collaborative semantics for recommender systems. Our approach can directly generate items from the entire item set for recommendation, without relying on candidate items. Specifically, we make two major contributions in our approach. For item indexing, we design a learning-based vector quantization method with uniform semantic mapping, which can assign meaningful and non-conflicting IDs (called item indices) for items. For alignment tuning, we propose a series of specially designed tuning tasks to enhance the integration of collaborative semantics in LLMs. Our fine-tuning tasks push LLMs to deeply integrate language and collaborative semantics (characterized by the learned item indices), so as to achieve an effective adaptation to recommender systems. Extensive experiments demonstrate the effectiveness of our method, showing that our approach can outperform a number of competitive baselines including traditional recommenders and existing LLM-based recommenders. Our code is available at https://github.com/RUCAIBox/LC-Rec/.
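
To make the idea of tree-structured item indices more concrete, here is a minimal sketch of greedy residual quantization, the general mechanism behind residual-quantized indexing. It is not the authors' implementation: the function names and the tiny hand-written codebooks are hypothetical, the codebooks are fixed rather than learned, and the uniform semantic mapping step that resolves index conflicts is not modeled.

```python
import math


def nearest_code(vec, codebook):
    """Return the index of the codebook vector closest to vec (squared Euclidean distance)."""
    best_idx, best_dist = 0, math.inf
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def residual_quantize(embedding, codebooks):
    """Map an embedding to one code index per level, from coarse to fine.

    At each level the nearest code is chosen, then subtracted, so the next
    level only has to describe what the previous levels missed.
    """
    residual = list(embedding)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices, residual


# Toy codebooks with made-up values (assumption: real codebooks are learned).
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],    # level 1: coarse directions
    [[0.25, 0.0], [0.0, 0.25]],  # level 2: finer corrections
]

idx_a, _ = residual_quantize([1.2, 0.1], codebooks)
idx_b, _ = residual_quantize([1.0, 0.3], codebooks)
print(idx_a, idx_b)
```

In this toy run the two nearby embeddings share their first-level code and differ only at the second level, which is the sense in which such indices form a tree: items with similar embeddings share a prefix.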

Paper Structure (36 sections, 4 equations, 6 figures, 5 tables, 1 algorithm)

Figures (6)

  • Figure 1: The overall framework of our LC-Rec. We enhance large language models (LLMs) by integrating language and collaborative semantics based on item indexing and alignment tuning, thereby adapting LLMs to recommender systems.
  • Figure 2: Performance of our framework with three indexing methods. We report HR@5 and NDCG@5 on the Games dataset. "SEQ" denotes fine-tuning with only the sequential item prediction task. "w/ ALIGN" denotes additionally using our semantic alignment tasks.
  • Figure 3: Performance of item prediction based on user intention.
  • Figure 4: 2D visualization of LLM token embeddings via PCA.
  • Figure 5: Case study of the semantics within item indices. In the title-generation cases, as the number of index tokens increases, the generated content progressively converges towards the target title, and the semantic changes show a trend from coarse to fine. In the related-item cases, compared to those based solely on language semantics, related items generated using item indices that integrate both language and collaborative semantics are more suitable for recommendation scenarios.
  • ...and 1 more figure
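
The Figure 2 caption reports HR@5 and NDCG@5. As a reminder of how these metrics are commonly computed, the sketch below assumes a single ground-truth item per test case, which is a frequent setup in sequential recommendation but is not stated in this section. The function names and item identifiers are hypothetical.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """1.0 if the target appears among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k with one relevant item: 1 / log2(rank + 1), or 0.0 if outside the top k.

    With a single relevant item the ideal DCG is 1, so no extra normalization is needed.
    """
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1  # 1-based position of the target
    return 1.0 / math.log2(rank + 1)


# Made-up ranking for one user; the target sits at position 3.
ranked = ["item_7", "item_3", "item_9", "item_1", "item_4"]
hr = hit_ratio_at_k(ranked, "item_9", 5)
ndcg = ndcg_at_k(ranked, "item_9", 5)
print(hr, ndcg)
```

Reported scores are typically the average of these per-user values over all test cases; HR only checks whether the target is in the top k, while NDCG also rewards placing it higher.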