BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction

Yinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao, Renhong Cheng

TL;DR

This paper tackles aspect sentiment quad prediction (ASQP) under few-shot settings by introducing a new dataset, FSQP, that better reflects unseen aspects and distribution shifts, and by proposing Broad-view Soft Prompting (BvSP). BvSP selects multiple templates using Jensen-Shannon divergence to capture correlations among templates, uses soft prompts to adapt a pre-trained language model, and aggregates the per-template predictions by voting to produce the final quads. Empirical results show that BvSP outperforms strong baselines in one-, two-, five-, and ten-shot scenarios and on the full-shot Rest15/Rest16 datasets, with notable gains in handling both explicit and implicit information. The work advances practical few-shot ASQP by combining augmentation-like template diversity with prompt tuning, enabling faster adaptation to new aspects in real-world applications.

Abstract

Aspect sentiment quad prediction (ASQP) aims to predict four aspect-based elements: aspect term, opinion term, aspect category, and sentiment polarity. In practice, unseen aspects, due to their distinct data distribution, pose many challenges for a trained neural model. Motivated by this, this work formulates ASQP in the few-shot scenario, which aims for fast adaptation in real applications. We first construct a few-shot ASQP dataset (FSQP) that contains richer categories and is more balanced for few-shot study. Moreover, recent methods extract quads through a generation paradigm, which involves converting the input sentence into a templated target sequence. However, they primarily focus on using a single template or on considering different template orders, thereby overlooking the correlations among various templates. To tackle this issue, we further propose a Broad-view Soft Prompting (BvSP) method that aggregates multiple templates with a broader view by taking into account the correlation between the different templates. Specifically, BvSP uses the pre-trained language model to select the most relevant k templates with Jensen-Shannon divergence. BvSP further introduces soft prompts to guide the pre-trained language model using the selected templates. Then, we aggregate the results of the multiple templates by a voting mechanism. Empirical results demonstrate that BvSP significantly outperforms the state-of-the-art methods under four few-shot settings and on other public datasets. Our code and dataset are available at https://github.com/byinhao/BvSP.
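
The selection and voting steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that each candidate template is summarized by a probability distribution over the same outcomes, that "most relevant" means the lowest mean Jensen-Shannon divergence to the other candidates, and that a quad is kept when at least `min_votes` templates predict it. The function names `select_templates` and `vote`, and the `min_votes` threshold, are hypothetical.

```python
import math
from collections import Counter


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def select_templates(dists, k):
    """Pick the k templates with the lowest mean JS divergence to the rest.

    `dists` maps a template name to its probability distribution.
    Assumption: a lower mean divergence means a more "relevant" template.
    """
    names = list(dists)
    if len(names) < 2:
        return names[:k]

    def mean_js(name):
        others = [n for n in names if n != name]
        total = sum(js_divergence(dists[name], dists[o]) for o in others)
        return total / len(others)

    return sorted(names, key=mean_js)[:k]


def vote(predictions, min_votes):
    """Keep quads predicted by at least `min_votes` templates.

    `predictions` holds one list of quads per template; each quad is a
    tuple (aspect term, opinion term, aspect category, sentiment polarity).
    """
    counts = Counter(q for quads in predictions for q in set(quads))
    return {q for q, c in counts.items() if c >= min_votes}


# Toy usage with made-up distributions and predictions.
dists = {
    "t1": [0.7, 0.2, 0.1],
    "t2": [0.6, 0.3, 0.1],
    "t3": [0.1, 0.1, 0.8],
}
chosen = select_templates(dists, k=2)

q1 = ("wifi", "slow", "internet", "negative")
q2 = ("wifi", "slow", "service", "negative")
q3 = ("staff", "friendly", "service", "positive")
final = vote([[q1, q2], [q1], [q1, q3]], min_votes=2)
```

In this toy run, `t3` is far from the other two distributions, so it is the template left out, and only the quad that at least two templates agree on survives the vote.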

Paper Structure (34 sections, 5 equations, 7 figures, 14 tables)

Figures (7)

  • Figure 1: An unseen aspect case. The newly emerged category "internet" is not in the pre-defined set of aspect categories.
  • Figure 2: The category distribution is presented according to the number of instances. For example, the green section indicates the proportion of categories with the number of instances between 1 and 50.
  • Figure 3: An overview of the proposed Broad-view Soft Prompting (BvSP). The single-template prediction approach is Paraphrase (Zhang et al., 2021). The multi-order prediction approach is DLO (Hu et al., 2022). BvSP combines these templates as candidates and proposes a correlation-guided strategy for template selection.
  • Figure 4: Quad elements (colored ones) of the output sequence from the decoder are filtered, which are leveraged for computing template correlations.
  • Figure 5: Effects of hyperparameters on FSQP under the one-shot setting.
  • ...and 2 more figures