Recommendations by Concise User Profiles from Review Text

Ghazaleh Haratinezhad Torbati; Anna Tigunova; Andrew Yates; Gerhard Weikum

TL;DR

The paper addresses recommender systems for users with sparse interactions but rich review text by introducing CUP, a lightweight two-tower transformer framework that encodes concise user profiles from reviews. It emphasizes efficient text selection (e.g., idf-based sentences) to fit a $128$-token input budget and combines user profiles with item metadata in a BERT-based encoding, optimized with binary cross-entropy and end-to-end training. CUP outperforms baselines, including LLM-based ranking, across standard and search-based evaluation in the book domain, while offering scalable efficiency and interpretability through explicit user profiles. The work provides practical guidance on profiling strategies, showing that low-cost, informative profiles often rival or exceed more costly generative profiles, with trade-offs in readability and faithfulness.
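The idf-based sentence selection under a fixed token budget mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence splitter, the word-level tokenizer (a stand-in for BERT word-piece tokens), the smoothed idf formula, the mean-idf sentence score, and the greedy fill are all assumptions made for illustration, and the names `compute_idf` and `build_profile` are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on ., !, ? (an assumption, not the paper's preprocessing)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercased word tokens; a rough stand-in for BERT word-piece tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def compute_idf(documents):
    """Smoothed idf over a review corpus: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def build_profile(user_reviews, idf, budget=128):
    """Greedily keep the highest-scoring sentences that fit the token budget."""
    default_idf = max(idf.values(), default=1.0)  # unseen terms count as rare
    scored = []
    for review in user_reviews:
        for sentence in split_sentences(review):
            tokens = tokenize(sentence)
            if not tokens:
                continue
            # Assumed scoring rule: mean idf of the sentence's tokens.
            score = sum(idf.get(t, default_idf) for t in tokens) / len(tokens)
            scored.append((score, len(tokens), sentence))
    scored.sort(key=lambda item: item[0], reverse=True)

    profile, used = [], 0
    for _, length, sentence in scored:
        if used + length <= budget:  # skip sentences that would overflow
            profile.append(sentence)
            used += length
    return " ".join(profile), used
```

In this sketch, `compute_idf` would be called once over all reviews in the corpus, and `build_profile` once per user; the returned string is what a transformer encoder would receive as the user-side input. Sentences made up of generic sentiment words ("Great book.") tend to score low because their terms appear in many reviews, so they are the first to be dropped when the budget is tight.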

Abstract

Recommender systems perform well for popular items and for users with ample interactions (likes, ratings, etc.). This work addresses the difficult and underexplored case of users who have very sparse interactions but post informative review texts. This setting naturally calls for encoding user-specific text with large language models (LLMs). However, feeding the full text of all reviews through an LLM suffers from a low signal-to-noise ratio and incurs high token-processing costs. This paper addresses both issues. It presents a lightweight framework, called CUP, which first computes concise user profiles and feeds only these into the training of transformer-based recommenders. For user profiles, we devise various techniques to select the most informative cues from noisy reviews. Experiments with book-review data show that fine-tuning a small language model with judiciously constructed profiles achieves the best performance, even in comparison to LLM-generated rankings.

Paper Structure (15 sections, 3 figures, 6 tables)

Introduction
Related Work
Methodology
System Architecture
Training
Inference
Coping with Long and Noisy Texts
Coping with Unlabeled Data
Experimental Design
Experimental Results
Comparison of CUP against Baselines
Efficiency of CUP
Comparison of CUP Configurations
Analysis of User Profiles
Conclusion

Figures (3)

Figure 1: User-written review, with uninformative text crossed out. Personal background is in purple, pure sentiment in orange, most informative cues in green.
Figure 2: Training time for different input lengths and trainable parameters (lines are marked every 5th epoch).
Figure 4: CUP results, by user/item groups (NDCG@5 with Search-based evaluation).
