Snippet-based Conversational Recommender System
Haibo Sun, Naoki Otani, Hannah Kim, Dan Zhang, Nikita Bhutani
TL;DR
SnipRec introduces a snippet-based, resource-efficient CRS that uses user-generated content (UGC) and LLM-driven extraction to capture diverse item knowledge and user preferences. By decomposing item reviews and seeker responses into atomic snippets, applying query expansion, and performing dense retrieval with NLI-based re-ranking, SnipRec achieves substantial gains in Hits@10 across restaurant, book, and clothing domains while reducing domain-specific annotation needs. The approach is evaluated with an LLM-based user simulator, showing high fidelity in simulated conversations and reliable snippet quality (low hallucination, high atomicity). Together, these contributions enable domain-adaptive CRS without heavy in-domain data collection or fine-tuning, though reliance on UGC and LLMs introduces considerations around data quality, costs, and potential biases.
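The snippet-matching idea can be illustrated with a toy sketch. It is not the paper's implementation: a bag-of-words cosine similarity stands in for a dense encoder, the query expansion and NLI re-ranking steps are omitted, and the names `embed`, `cosine`, and `rank_items`, as well as the rule of summing each preference's best-matching snippet score per item, are assumptions made for illustration only.

```python
import math
from collections import Counter


def embed(text):
    # Stand-in for a dense encoder (assumption): bag-of-words term counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_items(item_snippets, preference_snippets, k=10):
    # item_snippets: dict mapping item id -> list of snippets from its reviews.
    # preference_snippets: list of snippets extracted from the seeker's responses.
    scores = {}
    for item, snippets in item_snippets.items():
        item_vecs = [embed(s) for s in snippets]
        # Hypothetical aggregation: match each preference to its best
        # snippet for this item, then sum over preferences.
        scores[item] = sum(
            max((cosine(embed(p), v) for v in item_vecs), default=0.0)
            for p in preference_snippets
        )
    # Highest score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


items = {
    "cafe_a": ["quiet atmosphere", "great espresso"],
    "bar_b": ["loud music", "cheap drinks"],
}
top = rank_items(items, ["quiet place", "good espresso"])
```

Here each item is represented by several short snippets rather than one long review, so a single preference such as "quiet place" can match the one snippet that addresses it without being diluted by unrelated review text.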
Abstract
Conversational Recommender Systems (CRS) engage users in interactive dialogues to gather preferences and provide personalized recommendations. While existing studies have advanced conversational strategies, they often rely on predefined attributes or expensive, domain-specific annotated datasets, which limits their flexibility in handling diverse user preferences and adaptability across domains. We propose SnipRec, a novel resource-efficient approach that leverages user-generated content, such as customer reviews, to capture a broader range of user expressions. By employing large language models to map reviews and user responses into concise snippets, SnipRec represents user preferences and retrieves relevant items without the need for intensive manual data collection or fine-tuning. Experiments across the restaurant, book, and clothing domains show that snippet-based representations outperform document- and sentence-based representations, achieving Hits@10 of 0.25-0.55 with 3,000 to 10,000 candidate items while successfully handling free-form user responses.
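For readers unfamiliar with the metric, Hits@10 is commonly computed as the fraction of test conversations whose target item appears among the top 10 ranked candidates. The following is a minimal sketch under that standard definition; the function name `hits_at_k` and the input layout are illustrative assumptions, not taken from the paper.

```python
def hits_at_k(ranked_lists, targets, k=10):
    # ranked_lists: one ranked list of item ids per conversation (best first).
    # targets: the ground-truth item id for each conversation.
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Two conversations: the target is found in the first, missing in the second.
score = hits_at_k([["a", "b", "c"], ["d", "e", "f"]], ["b", "x"], k=10)
```

Under this definition, a Hits@10 of 0.25 over a pool of several thousand candidates means the target item reached the top 10 in one of every four conversations.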
