Text-like Encoding of Collaborative Information in Large Language Models for Recommendation

Yang Zhang, Keqin Bao, Ming Yan, Wenjie Wang, Fuli Feng, Xiangnan He

TL;DR

This paper addresses how to integrate collaborative information into large language model-based recommender systems (LLMRec). It introduces BinLLM, which encodes collaborative embeddings as text-like binary sequences, optionally compressed with dot-decimal notation, so that the information arrives in a form LLMs already process as text. The method has two components, a text-like encoding module and a LoRA-tuned LLM predictor, and is trained in two stages: the binary encoding is pre-trained first, then the LLM is tuned with LoRA. On MovieLens-1M and Amazon-Book, BinLLM reaches state-of-the-art or competitive AUC and UAUC, holds up in both warm and cold-start scenarios, and is backed by supportive ablation results; compression shortens the sequence with minimal accuracy loss. The results suggest that text-like encoding aligns collaborative information with LLMs better than latent-space embeddings do. The authors point to broader tasks and accelerated inference as future work.

Abstract

When adapting Large Language Models for Recommendation (LLMRec), it is crucial to integrate collaborative information. Existing methods achieve this by learning collaborative embeddings in LLMs' latent space from scratch or by mapping from external models. However, they fail to represent the information in a text-like format, which may not align optimally with LLMs. To bridge this gap, we introduce BinLLM, a novel LLMRec method that seamlessly integrates collaborative information through text-like encoding. BinLLM converts collaborative embeddings from external models into binary sequences -- a specific text format that LLMs can understand and operate on directly, facilitating the direct usage of collaborative information in text-like format by LLMs. Additionally, BinLLM provides options to compress the binary sequence using dot-decimal notation to avoid excessively long lengths. Extensive experiments validate that BinLLM introduces collaborative information in a manner better aligned with LLMs, resulting in enhanced performance. We release our code at https://github.com/zyang1580/BinLLM.
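
The encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sign threshold in `binarize` merely stands in for whatever binarization BinLLM actually trains, the 8-bit grouping in `to_dot_decimal` is an assumption borrowed from IPv4-style dot-decimal notation, and both function names and the sample embedding are hypothetical.

```python
from typing import Sequence


def binarize(embedding: Sequence[float]) -> str:
    """Map each dimension to '1' if positive, else '0'.

    Assumption: a plain sign threshold, used here only for illustration.
    """
    return "".join("1" if x > 0 else "0" for x in embedding)


def to_dot_decimal(bits: str, group: int = 8) -> str:
    """Compress a bit string by reading each `group`-bit chunk as a decimal number.

    Assumption: 8-bit groups joined by dots, as in IPv4 addresses.
    """
    if len(bits) % group != 0:
        raise ValueError("bit string length must be a multiple of the group size")
    chunks = [bits[i:i + group] for i in range(0, len(bits), group)]
    return ".".join(str(int(chunk, 2)) for chunk in chunks)


# Hypothetical 16-dimensional collaborative embedding.
embedding = [0.3, -1.2, 0.8, 0.1, -0.5, -0.7, 0.9, -0.2,
             -0.4, 0.6, 0.2, -0.9, 0.5, 0.7, -0.1, 0.4]

bits = binarize(embedding)
compressed = to_dot_decimal(bits)
print(bits)        # 1011001001101101
print(compressed)  # 178.109
```

Either string can then be placed in a prompt as ordinary text. The compressed form trades 16 characters for 7 in this example, which is the length reduction the abstract refers to.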

Paper Structure (22 sections, 6 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Model architecture overview of our BinLLM. The purple line is used to fill the text fields in the prompt template, introducing textual information like item titles, while the red line is used to fill the ID fields in the prompt template, introducing collaborative information.
  • Figure 2: Performance comparison in warm and cold scenarios on ML-1M and Amazon-Book. The left y-axis represents AUC, while the right one represents UAUC.
  • Figure 3: Performance of BinLLM with (w comp.) and without compression (w/o comp.). The left y-axis represents AUC, while the right one represents UAUC.