Collaborative Retrieval for Large Language Model-based Conversational Recommender Systems
Yaochen Zhu, Chao Wan, Harald Steck, Dawen Liang, Yesu Feng, Nathan Kallus, Jundong Li
TL;DR
CRAG addresses the challenge of using collaborative filtering (CF) within black-box LLM-based conversational recommender systems (CRS) by introducing context-aware collaborative retrieval and a two-step reflection pipeline. It links dialogue to items via LLM-based entity extraction and bi-level matching, retrieves CF candidates through context-aware retrieval based on an adapted EASE objective, and mitigates LLM bias with a reflect-and-rerank step that produces the final top-N recommendations. On Reddit-v2 and Redial, CRAG consistently outperforms zero-shot LLMs and Naive-RAG baselines, with the most pronounced gains on recently released items, and ablations indicate that both reflection stages are necessary. The work also provides a refined Reddit-v2 dataset and releases code and data, offering a practical benchmark for further LLM+CF CRS research.
Abstract
Conversational recommender systems (CRS) aim to provide personalized recommendations via interactive dialogues with users. While large language models (LLMs) enhance CRS with their superior understanding of context-aware user preferences, they typically struggle to leverage behavioral data, which have proven to be important for classical collaborative filtering (CF)-based approaches. For this reason, we propose CRAG, Collaborative Retrieval Augmented Generation for LLM-based CRS. To the best of our knowledge, CRAG is the first approach that combines state-of-the-art LLMs with CF for conversational recommendations. Our experiments on two publicly available movie conversational recommendation datasets, i.e., a refined Reddit dataset (which we name Reddit-v2) as well as the Redial dataset, demonstrate the superior item coverage and recommendation performance of CRAG, compared to several CRS baselines. Moreover, we observe that the improvements are mainly due to better recommendation accuracy on recently released movies. The code and data are available at https://github.com/yaochenzhu/CRAG.
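For readers unfamiliar with EASE, the item-item CF model that the TL;DR says CRAG adapts for retrieval, the following is a minimal sketch of the standard closed-form EASE fit followed by a top-k lookup from items mentioned in a dialogue. It shows plain EASE only, not CRAG's context-aware adaptation; the function names `fit_ease` and `retrieve`, the toy interaction matrix, and the regularization value are assumptions made for illustration.

```python
import numpy as np

def fit_ease(X, l2=1.0):
    """Closed-form EASE: item-item weights B with a zero diagonal.

    X is a (users x items) binary interaction matrix; l2 is the ridge
    regularization strength (value chosen arbitrarily for this sketch).
    """
    G = X.T @ X + l2 * np.eye(X.shape[1])  # regularized Gram matrix
    P = np.linalg.inv(G)
    B = P / (-np.diag(P))                  # divide column j by -P[j, j]
    np.fill_diagonal(B, 0.0)               # an item may not predict itself
    return B

def retrieve(B, mentioned, k=1):
    """Return the k highest-scoring items given item ids mentioned in a dialogue."""
    x = np.zeros(B.shape[0])
    x[mentioned] = 1.0
    scores = x @ B
    scores[mentioned] = -np.inf            # do not return already-mentioned items
    return [int(i) for i in np.argsort(-scores)[:k]]

# Toy data (hypothetical): items 0 and 1 co-occur, items 2 and 3 co-occur.
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
], dtype=float)

B = fit_ease(X)
top = retrieve(B, mentioned=[0], k=1)
```

In this toy setup, a dialogue mentioning item 0 retrieves item 1, the item it co-occurs with. In a CRS pipeline of the kind described above, such retrieved items would be handed to the LLM as candidate context rather than returned directly as recommendations.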
