TEARS: Textual Representations for Scrutable Recommendations
Emiliano Penaloza, Olivier Gouvert, Haolun Wu, Laurent Charlin
TL;DR
This work tackles the opacity and lack of user control in traditional latent-user representations by proposing TEARS, which encodes user preferences as natural-text summaries generated by a modern LLM. TEARS aligns these text-based representations with a standard VAE for collaborative filtering through an optimal-transport objective, and blends them with the learned latent space via a mixing coefficient $\alpha$ to balance transparency and performance. The approach yields high-quality recommendations while enabling user edits to directly steer rankings, demonstrated across MovieLens-1M, Netflix, and Goodbooks with robust controllability under large-scale and fine-grained edits and guided prompts. The work introduces a practical, scrutable, and controllable framework for recommender systems, with significant implications for user autonomy and transparency in personalized content.
Abstract
Traditional recommender systems rely on high-dimensional (latent) embeddings for modeling user-item interactions, often resulting in opaque representations that lack interpretability. Moreover, these systems offer users limited control over their recommendations. Inspired by recent work, we introduce TExtuAl Representations for Scrutable recommendations (TEARS) to address these challenges. Instead of representing a user's interests through a latent embedding, TEARS encodes them in natural text, providing transparency and allowing users to edit them. To do so, TEARS uses a modern LLM to generate user summaries based on user preferences. We find that the summaries capture user preferences uniquely. Using these summaries, we take a hybrid approach in which an optimal transport procedure aligns the summaries' representation with the learned representation of a standard VAE for collaborative filtering. We find that this approach can surpass the performance of three popular VAE models while providing user-controllable recommendations. We also analyze the controllability of TEARS through three simulated user tasks that evaluate how effectively users can steer recommendations by editing their summaries.
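
As a rough illustration of the hybrid approach, the sketch below shows one way a text-based representation and a VAE representation could be blended with the mixing coefficient $\alpha$, together with a simple optimal-transport alignment cost. It rests on assumptions not stated in the abstract: both representations are taken to be diagonal Gaussians, the alignment cost is taken to be the closed-form squared 2-Wasserstein distance between them, and the blend is taken to be a convex combination. The function names `blend` and `w2_squared_diag_gaussians` are hypothetical and are not taken from the TEARS implementation.

```python
from typing import List, Sequence


def blend(z_text: Sequence[float], z_latent: Sequence[float], alpha: float) -> List[float]:
    """Convex combination of two representations (assumed form of the blend).

    alpha = 1.0 uses only the text-based representation (most controllable);
    alpha = 0.0 uses only the VAE latent representation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(z_text) != len(z_latent):
        raise ValueError("representations must have the same dimension")
    return [alpha * t + (1.0 - alpha) * l for t, l in zip(z_text, z_latent)]


def w2_squared_diag_gaussians(
    mu_a: Sequence[float],
    sigma_a: Sequence[float],
    mu_b: Sequence[float],
    sigma_b: Sequence[float],
) -> float:
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For diagonal covariances the distance has the closed form
    ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2,
    where sigma_* are per-dimension standard deviations.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return mean_term + std_term


if __name__ == "__main__":
    # Toy 2-dimensional example with made-up numbers.
    z_mixed = blend([1.0, 0.0], [0.0, 1.0], alpha=0.25)
    cost = w2_squared_diag_gaussians([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
    print(z_mixed, cost)
```

Under these assumptions, minimizing the alignment cost during training would pull the two representations toward each other, so that moving $\alpha$ between 0 and 1 trades reliance on the latent representation against reliance on the editable text summary.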
