BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction
Yinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao, Renhong Cheng
TL;DR
This paper tackles aspect sentiment quad prediction (ASQP) under few-shot settings. It introduces a new dataset, FSQP, that better reflects the distribution shift caused by unseen aspects, and proposes Broad-view Soft Prompting (BvSP). BvSP selects multiple templates using Jensen-Shannon divergence to capture correlations among templates, uses soft prompts to adapt a pre-trained language model, and aggregates the per-template predictions by voting to produce the final quads. Empirical results show that BvSP outperforms strong baselines in one-, two-, five-, and ten-shot scenarios and on the full-shot Rest15/Rest16 datasets, with notable gains in handling both explicit and implicit information. The work advances practical few-shot ASQP by combining augmentation-like template diversity with prompt tuning, enabling faster adaptation to new aspects in real-world applications.
Abstract
Aspect sentiment quad prediction (ASQP) aims to predict four aspect-based elements: aspect term, opinion term, aspect category, and sentiment polarity. In practice, unseen aspects, due to their distinct data distribution, pose many challenges for a trained neural model. Motivated by this, this work formulates ASQP in the few-shot scenario, which aims for fast adaptation in real applications. We first construct a few-shot ASQP dataset (FSQP) that contains richer categories and is more balanced for the few-shot study. Moreover, recent methods extract quads through a generation paradigm, which involves converting the input sentence into a templated target sequence. However, they primarily focus on using a single template or on considering different template orders, thereby overlooking the correlations among various templates. To tackle this issue, we further propose a Broad-view Soft Prompting (BvSP) method that aggregates multiple templates with a broader view by taking into account the correlations between the different templates. Specifically, BvSP uses the pre-trained language model to select the k most relevant templates with Jensen-Shannon divergence. BvSP further introduces soft prompts to guide the pre-trained language model using the selected templates. Then, we aggregate the results of the multiple templates by a voting mechanism. Empirical results demonstrate that BvSP significantly outperforms the state-of-the-art methods under four few-shot settings and on other public datasets. Our code and dataset are available at https://github.com/byinhao/BvSP.
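
The two steps named above, template selection with Jensen-Shannon divergence and aggregation by voting, can be illustrated with a minimal sketch. It is not the released implementation. It assumes that each template is already summarized by a discrete probability distribution, that the k templates with the lowest mean divergence to the others are kept, and that a quad is accepted once at least `min_votes` templates predict it. The function names, the selection rule, and the vote threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def select_templates(dists, k):
    """Assumed rule: keep the k templates with the lowest mean divergence to the rest."""
    scores = {}
    for name, p in dists.items():
        others = [js_divergence(p, q) for other, q in dists.items() if other != name]
        scores[name] = sum(others) / len(others)
    return sorted(scores, key=scores.get)[:k]


def vote(predictions, min_votes):
    """Keep every quad predicted by at least min_votes templates (assumed threshold)."""
    counts = Counter(quad for quads in predictions for quad in set(quads))
    return {quad for quad, n in counts.items() if n >= min_votes}


# Made-up distributions for three hypothetical templates.
dists = {"t1": [0.5, 0.5], "t2": [0.6, 0.4], "t3": [0.0, 1.0]}
selected = select_templates(dists, k=2)

# Made-up quads: (aspect term, aspect category, opinion term, sentiment polarity).
q1 = ("pizza", "food quality", "delicious", "positive")
q2 = ("service", "service general", "slow", "negative")
q3 = ("NULL", "restaurant general", "great", "positive")
final_quads = vote([[q1, q2], [q1], [q1, q3]], min_votes=2)
```

In this toy input, only `q1` is predicted by more than one template, so it is the only quad that survives the vote. How the per-template distributions are obtained from the pre-trained language model is left out here because the abstract does not specify it.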
