One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems

Zuoli Tang, Zhaoxin Huan, Zihao Li, Xiaolu Zhang, Jun Hu, Chilin Fu, Jun Zhou, Lixin Zou, Chenliang Li

TL;DR

This work tackles data sparsity and cold-start in sequential recommendation by proposing LLM-Rec, a domain-agnostic framework that uses pre-trained large language models to generate unified user and item representations from cross-domain textual features. By mixing user histories across domains and encoding item titles directly, LLM-Rec leverages world knowledge embedded in LLMs to bridge semantic gaps and improve recommendations across all domains. The paper runs experiments across five real-world domains with model sizes from 40M to 6.7B parameters, and analyzes the effects of cross-domain data, model size, tuning method, and deployment cost. It also examines how semantic understanding and memory-like collaborative filtering contribute to performance, especially in cold-start and tail-item scenarios. The findings suggest that larger LLMs offer significant zero-shot benefits and tail-item improvements, but practical deployment requires weighing training and inference costs against the tuning strategy: full-parameter fine-tuning (FPFT) is favored for smaller models, while parameter-efficient fine-tuning (PEFT) is more viable at larger scales.

Abstract

Sequential recommendation systems aim to predict a user's next likely interaction based on their history. However, these systems face data sparsity and cold-start problems. Multi-domain methods, which draw on data from other domains, help alleviate these problems. However, traditional multi-domain methods rely on ID-based item representations that carry no semantic meaning, which makes it difficult to align items with similar meanings from different domains and yields sub-optimal knowledge transfer. This paper introduces LLM-Rec, a framework that utilizes pre-trained large language models (LLMs) for domain-agnostic recommendation. Specifically, we mix a user's behaviors from multiple domains and concatenate the item titles into a sentence, then use LLMs to generate user and item representations. By mixing behaviors across different domains, we can exploit the knowledge encoded in LLMs to bridge the semantics of multi-domain behaviors, thus obtaining semantically rich representations and improving performance in all domains. Furthermore, we explore the underlying reasons why LLMs are effective, investigating whether LLMs can understand semantic correlations when used as the recommendation model and whether techniques from NLP, such as scaling laws, also apply to recommendation. We conduct extensive experiments with LLMs ranging from 40M to 6.7B parameters to answer the above questions and to verify the effectiveness of LLM-Rec in multi-domain recommendation.
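
The pipeline described in the abstract (mix a user's behaviors across domains, concatenate item titles into one sentence, encode users and items with the same text encoder) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the function names, the timestamp-based merge, the comma separator, and the dot-product scoring are hypothetical, and `toy_encode` is a hashed bag-of-words stand-in for the pre-trained LLM so that the example runs without any model download.

```python
import hashlib
import math
from typing import Dict, List, Tuple

# One interaction is (timestamp, item title). All data here is made up.
Interaction = Tuple[int, str]


def mix_histories(per_domain: Dict[str, List[Interaction]]) -> List[str]:
    """Merge per-domain histories into one title sequence ordered by timestamp."""
    merged = [pair for history in per_domain.values() for pair in history]
    merged.sort(key=lambda pair: pair[0])
    return [title for _, title in merged]


def build_user_sentence(titles: List[str], sep: str = ", ") -> str:
    """Concatenate item titles into a single sentence (separator is an assumption)."""
    return sep.join(titles)


def toy_encode(text: str, dim: int = 64) -> List[float]:
    """Stand-in for the LLM encoder: hashed bag-of-words, L2-normalised.

    A real system would replace this with the pooled output of a
    pre-trained language model; this toy version has no semantic knowledge.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        token = token.strip(",.")
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def rank_items(user_sentence: str, candidates: List[str]) -> List[Tuple[str, float]]:
    """Score candidate titles by dot product with the user vector, best first."""
    user_vec = toy_encode(user_sentence)
    scored = [
        (title, sum(u * i for u, i in zip(user_vec, toy_encode(title))))
        for title in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


history = {
    "movies": [(1, "Saving Private Ryan"), (3, "Dunkirk")],
    "books": [(2, "The Second World War")],
}
titles = mix_histories(history)
sentence = build_user_sentence(titles)
ranking = rank_items(sentence, ["Dunkirk", "Gardening Basics"])
```

Because users and items pass through the same text encoder, an item from any domain, including one never seen during training, can be scored from its title alone; this is the property the abstract relies on for cold-start and cross-domain transfer. The toy encoder only matches shared words, whereas the paper's argument is that an LLM also captures semantic relatedness between differently worded titles.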

Paper Structure

This paper contains 42 sections, 6 equations, 15 figures, 7 tables.

Figures (15)

  • Figure 1: Users’ interests across various domains exhibit semantic correlations. A user who enjoys war-themed movies may also be interested in books related to World War II.
  • Figure 2: Applying single-domain ID-based model (SASRec) to multi-domain scenario.
  • Figure 3: Overview of the proposed LLM-Rec.
  • Figure 4: Two different cross-domain settings. The left half shows how the different training sets are partitioned, and the right half shows how the different test sets are partitioned. Note that in the left half, the item with a green background is ignored in both the Single Domain and S$2$PIAO settings because it occurs only once in the sequence.
  • Figure 5: Performance under different cross-domain data settings. The pre-trained language model is BERT-110M.
  • ...and 10 more figures