CPR: Leveraging LLMs for Topic and Phrase Suggestion to Facilitate Comprehensive Product Reviews
Ekta Gujral, Apurva Sinha, Lishi Ji, Bijayani Sanghamitra Mishra
TL;DR
CPR addresses the need for comprehensive product reviews by coupling large language models with topic modeling to generate topic-focused, sentiment-aligned phrases. The method follows a three-stage workflow: surface product-specific topics for rating, generate targeted phrases from those ratings, and integrate user-written text via topic modeling. Evaluation on Walmart data shows a $12.3\%$ BLEU improvement over baselines. A two-track phrase-generation strategy (a Bison-based first pass and the fine-tuned CPR model) indicates that fine-tuning with LoRA and PEFT yields better sentiment capture and phrase quality: in the BLEU results, CPR outperforms both baselines across n-gram orders. Case studies across Perfumes, Toys, and Ruffled Tops demonstrate CPR’s topic-suggestion accuracy (average $79.3\%$) and its ability to produce diverse, sentiment-consistent phrases, suggesting practical benefits for retailers and consumers by saving time and improving review usefulness. Overall, CPR advances structured review generation by aligning topics, ratings, and phrasing, with potential extensions in naturalness, personalization, and domain adaptation.
Abstract
Consumers rely heavily on online product reviews, analyzing both quantitative ratings and textual descriptions to assess product quality. However, existing research has not adequately addressed how to systematically encourage the creation of comprehensive reviews that capture both customer sentiment and detailed product feature analysis. This paper presents CPR, a novel methodology that leverages Large Language Models (LLMs) and Topic Modeling to guide users in crafting insightful and well-rounded reviews. Our approach employs a three-stage process: first, we present users with product-specific terms for rating; second, we generate targeted phrase suggestions based on these ratings; and third, we integrate user-written text through topic modeling, ensuring all key aspects are addressed. We evaluate CPR using text-to-text LLMs, comparing its performance against real-world customer reviews from Walmart. Our results demonstrate that CPR effectively identifies relevant product terms, even for new products lacking prior reviews, and provides sentiment-aligned phrase suggestions, saving users time and enhancing review quality. Quantitative analysis reveals a 12.3% improvement in BLEU score over baseline methods, further supported by manual evaluation of generated phrases. We conclude by discussing potential extensions and future research directions.
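
To make the BLEU comparison above concrete, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision with a brevity penalty) for scoring a suggested phrase against a reference review snippet. It is an illustration only: the function names, the single-reference setup, the uniform weights up to bigrams, and the example phrases are all assumptions, not the evaluation protocol used for CPR.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU with uniform weights up to max_n (assumed: no smoothing)."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)


# Hypothetical example phrases, not drawn from the Walmart data.
reference = "the scent lasts all day".split()
suggestion_a = "the scent lasts all day".split()
suggestion_b = "the scent fades quickly".split()

print(bleu(suggestion_a, reference))
print(bleu(suggestion_b, reference))
```

Under these assumptions, a suggestion identical to the reference receives the maximum score, while one that shares only some unigrams and bigrams scores lower. A relative improvement such as the reported 12.3% would then compare average scores of this kind between CPR and a baseline.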
