OpinioRAG: Towards Generating User-Centric Opinion Highlights from Large-scale Online Reviews
Mir Tafseer Nayeem, Davood Rafiei
TL;DR
The paper tackles the information overload created by thousands of online reviews per entity by introducing OpinioRAG, a scalable, training-free framework that generates user-centric opinion highlights through retrieval-augmented generation. It couples a retrieval stage, which extracts query-relevant evidence from long-form reviews, with a synthesizer that outputs structured PROS/CONS highlights in JSON format, guided by explicit query terms. To evaluate factual alignment in sentiment-rich domains, the authors propose three reference-free verification metrics based on AOS (Aspect-Opinion-Sentiment) triplets: Aspect Relevance, Sentiment Factuality, and Opinion Faithfulness. They also release OpinioBank, a large-scale benchmark dataset featuring thousands of long-form reviews per entity together with expert summaries. The results demonstrate the framework’s effectiveness, reveal that identifying critical drawbacks remains difficult, and suggest concrete directions for improvement, including incorporating metadata and opinion-holder signals to improve alignment and usefulness for end users.
Abstract
We study the problem of opinion highlights generation from large volumes of user reviews, often exceeding thousands per entity, where existing methods either fail to scale or produce generic, one-size-fits-all summaries that overlook personalized needs. To tackle this, we introduce OpinioRAG, a scalable, training-free framework that combines RAG-based evidence retrieval with LLMs to efficiently produce tailored summaries. Additionally, we propose novel reference-free verification metrics designed for sentiment-rich domains, where accurately capturing opinions and sentiment alignment is essential. These metrics offer a fine-grained, context-sensitive assessment of factual consistency. To facilitate evaluation, we contribute the first large-scale dataset of long-form user reviews, comprising entities with over a thousand reviews each, paired with unbiased expert summaries and manually annotated queries. Through extensive experiments, we identify key challenges, provide actionable insights into improving systems, pave the way for future research, and position OpinioRAG as a robust framework for generating accurate, relevant, and structured summaries at scale.
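To make the retrieve-then-synthesize idea concrete, the following is a minimal sketch only, not the authors' implementation. It assumes a toy token-overlap retriever and a tiny sentiment lexicon standing in for the LLM synthesizer; the names `retrieve`, `synthesize`, `opinion_highlights`, the lexicons, and the JSON field names are all hypothetical choices made for illustration.

```python
import json
import re

# Hypothetical toy lexicons standing in for an LLM's sentiment judgment.
POSITIVE = {"great", "clean", "friendly", "comfortable", "excellent"}
NEGATIVE = {"dirty", "noisy", "rude", "slow", "broken"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, sentences, k=2):
    """Toy retriever: rank sentences by token overlap with the query.

    Sentences with no overlap are dropped; ties keep the original order.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(s)), i, s) for i, s in enumerate(sentences)]
    scored = [t for t in scored if t[0] > 0]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [s for _, _, s in scored[:k]]


def synthesize(query, evidence):
    """Stand-in for the LLM synthesizer: sort evidence into PROS and CONS.

    Sentences with no clear polarity under the toy lexicons are skipped.
    """
    highlights = {"query": query, "PROS": [], "CONS": []}
    for sentence in evidence:
        tokens = tokenize(sentence)
        pos = len(tokens & POSITIVE)
        neg = len(tokens & NEGATIVE)
        if pos > neg:
            highlights["PROS"].append(sentence)
        elif neg > pos:
            highlights["CONS"].append(sentence)
    return highlights


def opinion_highlights(queries, sentences, k=2):
    """Run retrieval, then synthesis, independently for each query term."""
    return [synthesize(q, retrieve(q, sentences, k)) for q in queries]


# Made-up review sentences, used only to exercise the sketch.
reviews = [
    "The room was clean and comfortable.",
    "Staff were rude at check-in.",
    "The room was noisy at night.",
    "Breakfast was excellent.",
]
result = opinion_highlights(["room", "staff"], reviews)
print(json.dumps(result, indent=2))
```

Because each query is handled independently, only the top-k retrieved sentences ever reach the synthesis step, which is what lets this style of pipeline scale to entities with thousands of reviews. A real system would swap the overlap scorer for a proper retriever and the lexicon lookup for an LLM prompt.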
