MiniConGTS: A Near Ultimate Minimalist Contrastive Grid Tagging Scheme for Aspect Sentiment Triplet Extraction

Qiao Sun, Liujia Yang, Minghao Ma, Nanyang Ye, Qinying Gu

TL;DR

This study proposes a method that improves and exploits pretrained representations by combining a minimalist tagging scheme with a novel token-level contrastive learning strategy. It achieves performance comparable or superior to state-of-the-art techniques with a more compact design and reduced computational overhead.

Abstract

Aspect Sentiment Triplet Extraction (ASTE) aims to co-extract the sentiment triplets in a given corpus. Existing approaches within the pretraining-finetuning paradigm tend either to meticulously craft complex tagging schemes and classification heads, or to incorporate external semantic augmentation to enhance performance. In this study, we re-evaluate, for the first time, the redundancy in tagging schemes and the internal enhancement in pretrained representations. We propose a method to improve and utilize pretrained representations by integrating a minimalist tagging scheme and a novel token-level contrastive learning strategy. The proposed approach demonstrates performance comparable or superior to state-of-the-art techniques while featuring a more compact design and reduced computational overhead. Additionally, we are the first to formally evaluate GPT-4's performance in few-shot learning and Chain-of-Thought scenarios for this task. The results demonstrate that the pretraining-finetuning paradigm remains highly effective even in the era of large language models.

Paper Structure

This paper contains 50 sections, 8 equations, 6 figures, 12 tables, and 1 algorithm.

Figures (6)

  • Figure 1: An illustration of ASTE. Given the sentence "Bob Dylan is a great rocker, despite the broken CDs.", three triplets are to be extracted: (Bob Dylan, great, positive), (rocker, great, positive), and (CDs, broken, negative).
  • Figure 2: An overview of the proposed method, where "Encoder" denotes the sequential combination of a Tokenizer and a Pretrained Language Model (PLM).
  • Figure 3: The grid tagging scheme employs the fewest label classes while completely handling all triplet cases without conflict, overlap, or omission. Each area circled in red dashed lines corresponds to a triplet. For example, the intersection area between the columns of "broken" and the rows of "CDs" is marked as negative, with NEG. in its top-left cell and CTD. in the others. The blank cells in the matrix are labeled as an additional class but are omitted for visual simplicity.
  • Figure 4: An illustration of the "Contrastive Mask". Each token is paired with every other token. PULL denotes positive sample pairs, whose tokens belong to the same category and should be pulled closer together; PUSH denotes negative sample pairs, whose tokens belong to different categories and should be pushed apart. The lower triangular part of the matrix, marked MSK., consists of masked cells that are not involved in the computation. For example, "Bob" and "Dylan" are marked as a positive sample pair with PULL, indicating similarity, while "Bob" and "is" are marked as a negative sample pair with PUSH, indicating dissimilarity.
  • Figure 5: A plot of the hidden word representations on the $\mathcal{D}_1$ 14Res dataset, with the dimension reduced to 3. "Pretrained" refers to the representation output by the officially released model. We finetune the pretrained model with and without the contrastive learning strategy, respectively.
  • ...and 1 more figure
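
The captions of Figures 3 and 4 describe two matrices built over the tokens of a sentence: the tagging grid and the contrastive mask. The following is a minimal sketch of how such matrices could be constructed for the example sentence from Figure 1, not the authors' implementation. Several details are assumptions made for illustration: whitespace-level tokens instead of the PLM tokenizer's subwords, the function names `build_tag_grid` and `build_contrastive_mask`, the string labels (`"POS"`, `"NEG"`, `"CTD"`, `"-"` for blank cells), the three token categories (`aspect`, `opinion`, `other`), and masking the diagonal together with the lower triangle.

```python
from typing import List, Tuple

Span = Tuple[int, int]            # inclusive (start, end) token indices
Triplet = Tuple[Span, Span, str]  # (aspect span, opinion span, sentiment tag)


def build_tag_grid(n_tokens: int, triplets: List[Triplet], blank: str = "-") -> List[List[str]]:
    """Rows index aspect tokens, columns index opinion tokens (as in Figure 3).

    The top-left cell of each aspect-opinion region carries the sentiment tag;
    every other cell in the region is tagged CTD. All remaining cells are blank.
    """
    grid = [[blank] * n_tokens for _ in range(n_tokens)]
    for (a_start, a_end), (o_start, o_end), sentiment in triplets:
        for row in range(a_start, a_end + 1):
            for col in range(o_start, o_end + 1):
                is_top_left = (row == a_start and col == o_start)
                grid[row][col] = sentiment if is_top_left else "CTD"
    return grid


def build_contrastive_mask(categories: List[str]) -> List[List[str]]:
    """Pair each token with every later token (upper triangle, as in Figure 4).

    Same category -> PULL, different category -> PUSH.
    The lower triangle is MSK; masking the diagonal too is an assumption.
    """
    n = len(categories)
    mask = [["MSK"] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mask[i][j] = "PULL" if categories[i] == categories[j] else "PUSH"
    return mask


# Example sentence from Figure 1, split on whitespace (an assumption).
tokens = ["Bob", "Dylan", "is", "a", "great", "rocker", ",",
          "despite", "the", "broken", "CDs", "."]

# (Bob Dylan, great, positive), (rocker, great, positive), (CDs, broken, negative)
triplets = [((0, 1), (4, 4), "POS"),
            ((5, 5), (4, 4), "POS"),
            ((10, 10), (9, 9), "NEG")]

# Hypothetical per-token categories used only to drive the PULL/PUSH decision.
aspect_idx, opinion_idx = {0, 1, 5, 10}, {4, 9}
categories = ["aspect" if i in aspect_idx else "opinion" if i in opinion_idx else "other"
              for i in range(len(tokens))]

grid = build_tag_grid(len(tokens), triplets)
mask = build_contrastive_mask(categories)
```

With these inputs, the cell at row "CDs" and column "broken" holds the negative tag, and the pairs ("Bob", "Dylan") and ("Bob", "is") receive PULL and PUSH respectively, matching the examples given in the captions. In a real pipeline the spans would have to be mapped onto subword indices produced by the tokenizer.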