Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning
Hai-Yen Thi Nguyen, Cam-Van Thi Nguyen
TL;DR
This paper tackles the extraction of comparative opinions from product reviews by introducing MTP-COQE, an end-to-end framework that performs Comparative Quintuple Extraction (COQE) via multi-perspective prompt-based learning. The method comprises three components: multi-perspective augmentation to enrich the training data, transfer learning with a unified generative prompt template built on the quintuple structure $Q_5=(S,O,A,P,L)$, and constrained decoding that restricts generation to valid tokens. Experiments on Camera-COQE (English) and VCOM (Vietnamese) show state-of-the-art or competitive performance, with notable gains in end-to-end E-Q5-F1. The approach transfers across both languages, although the Vietnamese results expose remaining challenges and the effects of data imbalance. The work discusses both the benefits and the limitations of generative models for structured information extraction, and it suggests future work on integrating external knowledge and on improving the controllability and efficiency of such systems.
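To make the quintuple structure and the idea of a prompt template more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the slots $S$, $O$, $A$, $P$, $L$ stand for subject, object, aspect, predicate, and label, that each slot is written into the target string behind a bracketed marker such as `[S]`, and that "multi-perspective" augmentation can be approximated by emitting the same quintuple under several slot orderings. The marker format, the function names, and the ordering strategy are all hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations

# Slot names follow Q_5 = (S, O, A, P, L); the bracketed marker format is hypothetical.
SLOTS = ("S", "O", "A", "P", "L")


@dataclass(frozen=True)
class Quintuple:
    subject: str    # S (assumed reading of the slot name)
    object: str     # O
    aspect: str     # A
    predicate: str  # P
    label: str      # L

    def slot(self, name: str) -> str:
        return {
            "S": self.subject,
            "O": self.object,
            "A": self.aspect,
            "P": self.predicate,
            "L": self.label,
        }[name]


def linearize(q: Quintuple, order=SLOTS) -> str:
    """Write a quintuple as one target string; empty slots become 'none'."""
    return " ".join(f"[{name}] {q.slot(name) or 'none'}" for name in order)


def multi_perspective_targets(q: Quintuple, max_views: int = 3):
    """Return up to max_views (order, target string) pairs for one quintuple.

    Each view uses a different slot ordering. This is an assumed form of
    augmentation, chosen only for illustration.
    """
    views = []
    for order in permutations(SLOTS):
        views.append((order, linearize(q, order)))
        if len(views) == max_views:
            break
    return views


example = Quintuple("Canon S100", "Nikon P300", "lens", "sharper", "better")
for order, target in multi_perspective_targets(example):
    print("".join(order), "->", target)
```

Under these assumptions, a single annotated quintuple yields several training targets that differ only in slot order, which is one simple way to enlarge a small or imbalanced training set.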
Abstract
Comparative reviews are pivotal in understanding consumer preferences and influencing purchasing decisions. Comparative Quintuple Extraction (COQE) aims to identify five key components in text: the target entity, compared entities, compared aspects, opinions on these aspects, and polarity. Extracting precise comparative information from product reviews is challenging because of nuanced language and because errors propagate across the sequential subtasks of traditional methods. To mitigate these problems, we propose MTP-COQE, an end-to-end model designed for COQE. Leveraging multi-perspective prompt-based learning, MTP-COQE effectively guides the generative model in comparative opinion mining tasks. Evaluation on the Camera-COQE (English) and VCOM (Vietnamese) datasets demonstrates MTP-COQE's efficacy in automating COQE, achieving a 1.41% higher F1 score than the previous baseline models on the English dataset. Additionally, we design a strategy to limit the generative model's creativity so that the output meets expectations. We also perform data augmentation to address data imbalance and to prevent the model from becoming biased towards dominant samples.
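The strategy of limiting the generative model's creativity can be pictured as a mask applied at each decoding step. The sketch below is illustrative only and does not reproduce the paper's decoder. It assumes that the valid tokens are the words of the input sentence plus a small set of slot markers and label words, and that decoding is greedy. The marker set, the label vocabulary, and the function names are hypothetical.

```python
import math

# Hypothetical control vocabulary; the source only states that generation
# is restricted to valid tokens.
SLOT_MARKERS = {"[S]", "[O]", "[A]", "[P]", "[L]"}
LABEL_WORDS = {"better", "worse", "equal", "different"}
EXTRA_TOKENS = {"none", "</s>"}


def allowed_tokens(source_sentence: str) -> set:
    """Tokens the decoder may emit: words copied from the input plus control tokens."""
    return set(source_sentence.split()) | SLOT_MARKERS | LABEL_WORDS | EXTRA_TOKENS


def constrained_greedy_step(scores: dict, allowed: set) -> str:
    """Pick the highest-scoring token after masking disallowed ones to -inf."""
    masked = {tok: (s if tok in allowed else -math.inf) for tok, s in scores.items()}
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no allowed token among the candidates")
    return best


allowed = allowed_tokens("the lens is sharper than the Nikon")
# Toy scores for one decoding step: "amazing" scores highest but is not in the input.
step_scores = {"amazing": 5.0, "sharper": 3.0, "[P]": 1.0}
print(constrained_greedy_step(step_scores, allowed))
```

In this toy step, the unconstrained choice would be a word that never appears in the review, while the masked choice stays within the input sentence. With a real sequence-to-sequence model, the same effect would be obtained by masking the logits over the model's vocabulary before selecting the next token.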
