Listwise Preference Optimization with Element-wise Confusions for Aspect Sentiment Quad Prediction
Wenna Lai, Haoran Xie, Guandong Xu, Qing Li, S. Joe Qin
TL;DR
Aspect sentiment quad prediction (ASQP) requires extracting a four-element sentiment quadruple (a, c, o, s) per sentence, but existing marker-based supervised fine-tuning struggles with inter-element dependencies and offers limited interpretability. The authors propose a reasoning-based generation framework that outputs both the quadruple and a synchronized natural-language rationale, coupled with Element-wise Confusable Candidates and a listwise preference optimization (E4L) to enforce structural validity and relational coherence. They build confusable candidates using syntactic distance and semantic similarity, then train with a listwise objective that ranks the gold output above all confusions. Across four benchmark datasets, E4L consistently surpasses non-generative, generative, and collaborative baselines, achieving higher quad F1 and more coherent explanations, which indicates improved robustness and interpretability on ASQP.
Abstract
Aspect sentiment quad prediction (ASQP) is inherently challenging because it requires predicting a structured quadruple of four core sentiment elements: aspect term (a), aspect category (c), opinion term (o), and sentiment polarity (s). Prior methods that rely on marker-based prediction struggle to model the intricate relationships among elements, and under standard supervised fine-tuning their performance declines sharply when predicting higher-order elements (e.g., c and s). To address these limitations, we employ reasoning-based generation that outputs both the quadruple and a natural language rationale under element prefixes within a unified template, encouraging explicit relational reasoning and interpretability. To further enhance element-wise alignment, we introduce a listwise preference optimization framework that improves structural validity and relational coherence. Specifically, we generate element-wise confusable candidates via syntactic and semantic proximity, then train the model with listwise objectives to prefer the gold candidates over closely competing alternatives. Extensive experiments on four benchmark datasets demonstrate that our framework effectively improves quadruple prediction accuracy and explanation consistency.
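To make the idea of ranking a gold quadruple above element-wise confusions concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each confusable candidate differs from the gold quadruple in exactly one element and that the listwise objective is a softmax cross-entropy over candidate scores; the abstract does not specify either detail. The names `Quad`, `element_confusions`, and `listwise_loss` are hypothetical, and the hand-supplied `alternatives` dictionary stands in for the syntactic and semantic proximity search described above.

```python
import math
from typing import Dict, List, NamedTuple


class Quad(NamedTuple):
    """A sentiment quadruple (a, c, o, s)."""
    aspect: str
    category: str
    opinion: str
    sentiment: str


def element_confusions(gold: Quad, alternatives: Dict[str, List[str]]) -> List[Quad]:
    """Build candidates that differ from the gold quad in exactly one element.

    `alternatives` maps a field name to replacement values. In the paper these
    would come from syntactic and semantic proximity; here they are supplied
    by hand (an assumption made for illustration).
    """
    candidates = []
    for field, values in alternatives.items():
        for value in values:
            if value != getattr(gold, field):  # skip values identical to gold
                candidates.append(gold._replace(**{field: value}))
    return candidates


def listwise_loss(scores: List[float], gold_index: int = 0) -> float:
    """Negative log-probability of the gold candidate under a softmax over
    all candidate scores (one common listwise objective, assumed here)."""
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[gold_index]


# Hypothetical example sentence: "The pizza was delicious."
gold = Quad("pizza", "food quality", "delicious", "positive")
confusions = element_confusions(
    gold,
    {"category": ["food prices", "food quality"], "sentiment": ["negative"]},
)
# Scores would come from the model (e.g. sequence log-likelihoods); these are made up.
scores = [2.0] + [0.5] * len(confusions)
loss = listwise_loss(scores, gold_index=0)
```

Under this sketch, the loss shrinks as the gold score rises relative to every confusion in the list, which is the behavior a listwise preference objective is meant to encourage.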
