Text-like Encoding of Collaborative Information in Large Language Models for Recommendation
Yang Zhang, Keqin Bao, Ming Yan, Wenjie Wang, Fuli Feng, Xiangnan He
TL;DR
This paper addresses how to integrate collaborative information into large language model-based recommender systems (LLMRec). It introduces BinLLM, which encodes collaborative embeddings as text-like binary sequences, optionally compressed with dot-decimal notation, so that the information arrives in a form LLMs already process as text. The method comprises a text-like encoding module and a LoRA-tuned LLM predictor, trained in two stages: pre-training of the binary encoding, followed by LoRA tuning. On MovieLens-1M and Amazon-Book, BinLLM achieves state-of-the-art or competitive results on AUC-based metrics, remains robust in both warm-start and cold-start settings, and is supported by ablation studies; compression shortens the sequence with minimal accuracy loss. The work shows that text-like encoding can better align collaborative information with LLMs, with extension to broader tasks and faster inference left as future work.
Abstract
When adapting Large Language Models for Recommendation (LLMRec), it is crucial to integrate collaborative information. Existing methods achieve this by learning collaborative embeddings in LLMs' latent space from scratch or by mapping from external models. However, they do not represent the information in a text-like format, so it may not align optimally with LLMs. To bridge this gap, we introduce BinLLM, a novel LLMRec method that integrates collaborative information through text-like encoding. BinLLM converts collaborative embeddings from external models into binary sequences, a specific text format that LLMs can understand and operate on directly, so that LLMs can use collaborative information in text-like form. Additionally, BinLLM provides options to compress the binary sequence using dot-decimal notation to avoid excessively long sequences. Extensive experiments validate that BinLLM introduces collaborative information in a manner better aligned with LLMs, resulting in enhanced performance. We release our code at https://github.com/zyang1580/BinLLM.
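
To make the encoding concrete, the following is a minimal sketch of the two steps named in the abstract: turning an embedding into a binary sequence and compressing it with dot-decimal notation. It is not taken from the released code. The function names `binarize` and `to_dot_decimal` are hypothetical, thresholding each dimension at zero is an assumption about how bits are obtained, and the 8-bit group size is assumed by analogy with IPv4 addresses.

```python
from typing import Sequence


def binarize(embedding: Sequence[float]) -> str:
    """Map each dimension to '1' if positive, else '0'.

    Thresholding at zero is an assumption made for this sketch.
    """
    return "".join("1" if x > 0 else "0" for x in embedding)


def to_dot_decimal(bits: str, group: int = 8) -> str:
    """Write each `group`-bit chunk as a decimal number, joined by dots.

    The default of 8 bits per chunk mirrors IPv4 notation (assumed here).
    """
    if len(bits) % group != 0:
        raise ValueError("bit length must be a multiple of the group size")
    chunks = (bits[i:i + group] for i in range(0, len(bits), group))
    return ".".join(str(int(chunk, 2)) for chunk in chunks)


# Illustrative 16-dimensional embedding (made-up values).
embedding = [0.3, -1.2, 0.8, 0.1, -0.5, -0.7, 0.9, -0.2,
             -0.4, 0.6, 0.2, -0.9, 0.5, -0.1, -0.3, 0.7]

bits = binarize(embedding)      # 16-character string of '0'/'1'
compact = to_dot_decimal(bits)  # two decimal numbers separated by a dot
print(bits, compact)
```

Under these assumptions, a 16-bit sequence shrinks from 16 characters to at most 7, which illustrates why compression helps keep prompts short when embeddings have more dimensions.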
