Language Representations Can be What Recommenders Need: Findings and Potentials

Leheng Sheng, An Zhang, Yi Zhang, Yuxin Chen, Xiang Wang, Tat-Seng Chua

TL;DR

This work interrogates whether language model representations inherently encode user preferences for recommendation. It demonstrates that a simple linear mapping from language representations of item titles into a behavior space can yield strong, sometimes state-of-the-art recommendations, suggesting a homomorphism between language and behavior spaces. Building on this insight, the authors introduce AlphaRec, a language-representation–based collaborative filtering model that uses a frozen LM for item text and a small nonlinear projector with graph convolution, trained with a contrastive loss, and show it outperforms ID-based baselines across diverse datasets, including zero-shot and intention-aware settings. Additionally, the paper highlights potentials of language representations for initialization, generalization, and intention-aware customization, underscoring a promising bridge between natural language understanding and user-behavior modeling for recommender systems.

Abstract

Recent studies empirically indicate that language models (LMs) encode rich world knowledge beyond mere semantics, attracting significant attention across various fields. However, in the recommendation domain, it remains uncertain whether LMs implicitly encode user preference information. Contrary to the prevailing understanding that LMs and traditional recommenders learn two distinct representation spaces due to the huge gap between language and behavior modeling objectives, this work re-examines this understanding and explores extracting a recommendation space directly from the language representation space. Surprisingly, our findings demonstrate that item representations, when linearly mapped from advanced LM representations, yield superior recommendation performance. This outcome suggests a possible homomorphism between the advanced language representation space and an effective item representation space for recommendation, implying that collaborative signals may be implicitly encoded within LMs. Motivated by these findings, we explore the possibility of designing advanced collaborative filtering (CF) models purely based on language representations without ID-based embeddings. Specifically, we incorporate several crucial components to build a simple yet effective model, with item titles as the input. Empirical results show that such a simple model can outperform leading ID-based CF models, which sheds light on using language representations for better recommendation. Moreover, we systematically analyze this simple model and find several key features of using advanced language representations: a good initialization for item representations, zero-shot recommendation abilities, and awareness of user intention. Our findings highlight the connection between language modeling and behavior modeling, which can inspire both the natural language processing and recommender system communities.
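The linear-mapping idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the random `item_lm` matrix stands in for frozen LM embeddings of item titles, the user vector is assumed to be the mean of the embeddings of interacted items, and the InfoNCE-style loss with temperature `tau` is one common choice of contrastive objective. All names, dimensions, and values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 6 items, 16-dim language space, 8-dim behavior space.
n_items, d_lm, d_rec = 6, 16, 8

# Stand-in for frozen LM embeddings of item titles (random here, for illustration only).
item_lm = rng.normal(size=(n_items, d_lm))

# The only learnable parameters in this sketch: a linear map from language to behavior space.
W = rng.normal(scale=0.1, size=(d_lm, d_rec))


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def project_items(item_lm, W):
    """Linearly map every item embedding into the behavior space."""
    return l2_normalize(item_lm @ W)


def user_vector(history, item_lm, W):
    """Assumption: a user is the mean LM embedding of interacted items, then projected."""
    return l2_normalize(item_lm[history].mean(axis=0) @ W)


def recommend(history, item_lm, W, k=3):
    """Rank unseen items by cosine similarity to the user vector."""
    scores = project_items(item_lm, W) @ user_vector(history, item_lm, W)
    scores[history] = -np.inf  # never recommend items the user already interacted with
    return np.argsort(-scores)[:k]


def info_nce(user, pos, negs, tau=0.15):
    """InfoNCE-style loss: pull the positive item toward the user, push negatives away."""
    logits = np.concatenate([[user @ pos], negs @ user]) / tau
    m = logits.max()  # subtract the max for a numerically stable log-sum-exp
    return -(logits[0] - (m + np.log(np.exp(logits - m).sum())))


history = [0, 1]
top_k = recommend(history, item_lm, W)
items = project_items(item_lm, W)
loss = info_nce(user_vector(history, item_lm, W), items[2], items[3:])
```

In a real setting, `W` would be trained by minimizing such a loss over observed user-item interactions while the LM embeddings stay fixed, so any recommendation quality comes from what the language representations already encode plus the linear map.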

Paper Structure (42 sections, 4 equations, 11 figures, 16 tables)

Figures (11)

  • Figure 1: Linearly mapping item titles in language representation space into behavior space yields superior recommendation performance on the Movies & TV Amazon dataset. (a) The framework of linear mapping. (b) The recommendation performance comparison between leading CF recommenders and linear mapping. (c) The t-SNE visualizations of movie representations, with colored lines linking identical movies or user intention across language space (left) and linearly projected behavior space for recommendation (right).
  • Figure 2: The recommendation performance of linear mapping with different language model sizes.
  • Figure 3: (a) The effect of each component on the Books dataset. (b) The number of epochs needed for each model to converge. AlphaRec exhibits a breakneck convergence speed.
  • Figure 4: User intention capture experiments on MovieLens-1M. (a) AlphaRec refines the recommendations according to language-based user intention. (b) The effect of user intention strength $\alpha$.
  • Figure 5: Example of item titles.
  • ...and 6 more figures