Language-Based User Profiles for Recommendation

Joyce Zhou, Yijia Dai, Thorsten Joachims

TL;DR

This paper proposes the Language-based Factorization Model (LFM), an encoder/decoder model in which both the encoder and the decoder are large language models (LLMs). Generating a compact, human-readable summary profile often performs comparably with or better than direct LLM prediction, while offering better interpretability and shorter model input length.

Abstract

Most conventional recommendation methods (e.g., matrix factorization) represent user profiles as high-dimensional vectors. Unfortunately, these vectors lack interpretability and steerability, and often perform poorly in cold-start settings. To address these shortcomings, we explore the use of user profiles that are represented as human-readable text. We propose the Language-based Factorization Model (LFM), which is essentially an encoder/decoder model where both the encoder and the decoder are large language models (LLMs). The encoder LLM generates a compact natural-language profile of the user's interests from the user's rating history. The decoder LLM uses this summary profile to complete predictive downstream tasks. We evaluate our LFM approach on the MovieLens dataset, comparing it against matrix factorization and an LLM that directly predicts from the user's rating history. In cold-start settings, we find that our method can have higher accuracy than matrix factorization. Furthermore, we find that generating a compact and human-readable summary often performs comparably with or better than direct LLM prediction, while enjoying better interpretability and shorter model input length. Our results motivate a number of future research directions and potential improvements.
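
The encoder/decoder flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable type, the prompt wording, the function names (`encode_profile`, `decode_rating`), the 1-to-5 rating scale, and the `stub_llm` stand-in are all assumptions made for the example. Any real model call would replace the stub.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: any function mapping a prompt string to generated text.
LLM = Callable[[str], str]
Rating = Tuple[str, float]  # (movie title, rating)


def encode_profile(llm: LLM, history: List[Rating], max_words: int = 50) -> str:
    """Encoder step: compress a rating history into a short text profile."""
    lines = "\n".join(f"- {title}: {rating}" for title, rating in history)
    prompt = (
        f"Here are movies a user rated:\n{lines}\n"
        f"Summarize this user's interests in at most {max_words} words."
    )
    return llm(prompt).strip()


def decode_rating(llm: LLM, profile: str, movie: str) -> Optional[float]:
    """Decoder step: predict a rating from the profile alone.

    Returns None when no number can be parsed from the model output,
    i.e. the prediction is not readable.
    """
    prompt = (
        f"User profile: {profile}\n"
        f"Predict the rating (assumed scale: 1 to 5) this user gives to "
        f"'{movie}'. Answer with a number."
    )
    match = re.search(r"\d+(?:\.\d+)?", llm(prompt))
    return float(match.group()) if match else None


def stub_llm(prompt: str) -> str:
    """Canned stand-in for a real LLM, used only to make the sketch runnable."""
    if "Summarize" in prompt:
        return "Enjoys science fiction and thrillers; dislikes musicals."
    return "Predicted rating: 4.0"


if __name__ == "__main__":
    history = [("Alien", 5.0), ("Grease", 1.0)]
    profile = encode_profile(stub_llm, history)
    print(profile)
    print(decode_rating(stub_llm, profile, "Blade Runner"))
```

Note that the decoder only sees the text profile, not the raw history. This is what keeps the decoder input short and lets a person read or edit the profile before prediction.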

Paper Structure

This paper contains 15 sections, 21 figures, and 1 table.

Figures (21)

  • Figure 1: Summary of how our user representation method works and what tasks we tested it on.
  • Figure 2: Fraction of readable predictions for all tasks with different methods and models vs. history size.
  • Figure 3: Performance (RMSE, MAE, and error rate) for all tasks with different methods (using Llama 2 13B) vs. history size.
  • Figure 4: Bias (mean error) on the rating prediction task with different methods and models vs. history size.
  • Figure 5: Performance metrics (RMSE, MAE, and error rate) for all tasks with different LFM summary lengths at history size 30.
  • ...and 16 more figures
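
The metrics named in the captions above can be computed along these lines. This is a sketch under assumptions, since the page does not define them: bias is taken here as the mean of (prediction minus target), error rate as the fraction of incorrect categorical predictions, and an unreadable prediction is represented as `None`. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def readable_fraction(preds: Sequence[Optional[float]]) -> float:
    """Fraction of predictions that could be parsed (None means unreadable)."""
    return sum(p is not None for p in preds) / len(preds)


def rmse(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Root mean squared error over readable predictions."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    )


def mae(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean absolute error over readable predictions."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def bias(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean error; sign convention (prediction minus target) is an assumption."""
    return sum(p - t for p, t in zip(preds, targets)) / len(preds)


def error_rate(preds: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of categorical predictions that differ from the target."""
    return sum(p != t for p, t in zip(preds, targets)) / len(preds)
```

Under this sign convention, a negative bias means the model tends to predict ratings lower than the ones users actually gave.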