Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs

Xiaobin Zhang; Liangjun Zang; Qianwen Liu; Shuchong Wei; Songlin Hu

TL;DR

This paper proposes a retrieval-augmented TempRel extraction approach that uses knowledge retrieved from large language models (LLMs) to enhance prompt templates and verbalizers, improving performance on event temporal relation extraction.

Abstract

Event temporal relation (TempRel) extraction is a core subtask of event relation extraction. However, the inherent ambiguity of TempRel makes the task difficult. With the rise of prompt engineering, designing effective prompt templates and verbalizers has become important for extracting relevant knowledge, yet traditional manually designed templates struggle to extract precise temporal knowledge. This paper introduces a novel retrieval-augmented TempRel extraction approach that leverages knowledge retrieved from large language models (LLMs) to enhance prompt templates and verbalizers. Our method capitalizes on the diverse capabilities of various LLMs to generate a wide array of ideas for template and verbalizer design, exploiting their generative ability to contribute more knowledge to that design. Empirical evaluations on three widely recognized datasets demonstrate the efficacy of our method in improving the performance of event temporal relation extraction.
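
To make the terms "prompt template" and "verbalizer" concrete, the following is a minimal sketch of a pattern-verbalizer pair for TempRel classification. It is not the authors' implementation: the template wording, the label set, the label words, and the function names `build_prompt` and `verbalize` are all assumptions chosen for illustration. The scores passed to `verbalize` stand in for the probabilities a masked language model would assign to candidate words at the `[MASK]` position.

```python
from typing import Dict, List

# Hypothetical template; the templates used in the paper are not shown in this section.
TEMPLATE = "{sentence} The temporal relation between {e1} and {e2} is [MASK]."

# Hypothetical hard verbalizer: each relation label maps to a few label words.
VERBALIZER: Dict[str, List[str]] = {
    "BEFORE": ["before", "earlier"],
    "AFTER": ["after", "later"],
    "EQUAL": ["simultaneous", "during"],
    "VAGUE": ["unclear", "possibly"],
}


def build_prompt(sentence: str, e1: str, e2: str) -> str:
    """Wrap the input sentence and the two event mentions in the template."""
    return TEMPLATE.format(sentence=sentence, e1=e1, e2=e2)


def verbalize(mask_word_scores: Dict[str, float]) -> str:
    """Average the scores of each label's words and return the best label.

    Words missing from `mask_word_scores` are treated as having score 0.0.
    """
    label_scores = {
        label: sum(mask_word_scores.get(w, 0.0) for w in words) / len(words)
        for label, words in VERBALIZER.items()
    }
    return max(label_scores, key=label_scores.get)


if __name__ == "__main__":
    print(build_prompt("He ate and then left.", "ate", "left"))
    # Made-up scores standing in for a masked language model's output.
    print(verbalize({"before": 0.6, "earlier": 0.2, "after": 0.1}))
```

A soft verbalizer would replace the fixed word lists with learned label embeddings; the hard variant above is shown only because it can be written without a model.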

Paper Structure (23 sections, 4 equations, 6 figures, 8 tables, 1 algorithm)

This paper contains 23 sections, 4 equations, 6 figures, 8 tables, 1 algorithm.

Introduction
Related work
TempRel Extraction
Prompt-based Learning
Retrieval-Augmented Generation
TempRel Task
Overall Architecture
Fine-tuning PLM for RAG
PVP training and Reasoning
Experiments
Datasets
Implementation Details
Overall Performance
Performance on Individual Labels
PVP Analysis
...and 8 more sections

Figures (6)

Figure 1: The overall architecture of our model.
Figure 2: Convergence of different PVP pairs and their F1 scores on the validation dataset. #n T denotes the #n template in Table \ref{preliminaryExp}, and SV and HV abbreviate Soft Verbalizer and Hard Verbalizer, respectively. The #3T_SV method (#3 template with the soft verbalizer in Table \ref{preliminaryExp}) converges faster than the other methods and attains a higher F1 score with an appropriate loss. This experiment demonstrates the importance of carefully choosing PVPs.
Figure 3: Paradigms of full-model tuning and prompt tuning. The $\langle X \rangle$ represents the traditional typical pre-training encoder-decoder mask model.
Figure 4: Modifier word frequency statistics returned by LLMs.
Figure 5: PLM and loss function selection, where B denotes BERT, R denotes RoBERTa, T denotes T5, and G denotes GPT. Since RoBERTa-large outperforms the others, we use RoBERTa-large in the follow-up experiments. Focal, Dice, CE, ATTN, and MLP denote Focal Loss, Dice Loss, Cross-Entropy, self-attention, and Multi-layer Perceptron, respectively.
...and 1 more figures
