TacoERE: Cluster-aware Compression for Event Relation Extraction
Yong Guan, Xiaozhi Wang, Lei Hou, Juanzi Li, Jeff Pan, Jiaoyan Chen, Freddy Lecue
TL;DR
TacoERE introduces a cluster-aware compression framework for event relation extraction that treats document understanding as a compression-then-extraction problem. By partitioning a document into intra- and inter-clusters and generating concise cluster summaries, TacoERE reduces the long-range dependencies and information redundancy that hinder relation prediction. The model combines a clustering objective, transformer-based summarization, RoBERTa-based relation prediction, and reinforcement-learning-based joint training with an event-chain pretraining step. Empirical results on MAVEN-ERE, EventStoryLine, and HiEve show consistent gains for both small pre-trained language models (PLMs) and large language models (LLMs), with particularly strong improvements on long-distance relations and when using LLMs.
Abstract
Event relation extraction (ERE) is a critical and fundamental challenge for natural language processing. Existing work mainly models the entire document directly, an approach that cannot effectively handle long-range dependencies and information redundancy. To address these issues, we propose TacoERE, a cluster-aware compression method for improving event relation extraction that explores a compression-then-extraction paradigm. Specifically, we first introduce document clustering to model event dependencies. It splits the document into intra- and inter-clusters, where intra-clusters aim to enhance the relations within the same cluster, while inter-clusters attempt to model related events at arbitrary distances. Second, we use cluster summarization to simplify and highlight the important text content of clusters, mitigating information redundancy and reducing event distance. We conduct extensive experiments with both pre-trained language models, such as RoBERTa, and large language models, such as ChatGPT and GPT-4, on three ERE datasets: MAVEN-ERE, EventStoryLine, and HiEve. Experimental results demonstrate that TacoERE is an effective method for ERE.
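To make the compression-then-extraction idea concrete, the following is a minimal toy sketch in Python, not the authors' implementation. It rests on several assumptions that do not come from the paper: intra-clusters are approximated by fixed-size chunks of consecutive sentences, an inter-cluster is the union of the two intra-clusters that contain the event pair, and the learned summarizer is replaced by a simple filter that keeps only sentences mentioning one of the events. All function names (`intra_clusters`, `inter_clusters`, `summarize`, `compress_for_pair`) are hypothetical.

```python
from itertools import combinations
from typing import List, Set


def _words(sentence: str) -> Set[str]:
    """Lowercased tokens with surrounding punctuation stripped."""
    return {w.strip(".,;:!?") for w in sentence.lower().split()}


def intra_clusters(sentences: List[str], size: int) -> List[List[int]]:
    """Assumed clustering: contiguous chunks of `size` sentence indices."""
    return [list(range(i, min(i + size, len(sentences))))
            for i in range(0, len(sentences), size)]


def inter_clusters(clusters: List[List[int]]) -> List[List[int]]:
    """Assumed inter-clusters: the union of every pair of intra-clusters."""
    return [a + b for a, b in combinations(clusters, 2)]


def summarize(sentences: List[str], cluster: List[int],
              events: List[str]) -> List[str]:
    """Stand-in for a learned summarizer: keep sentences that mention an event."""
    return [sentences[i] for i in cluster
            if any(e in _words(sentences[i]) for e in events)]


def compress_for_pair(sentences: List[str], e1: str, e2: str,
                      size: int = 2) -> List[str]:
    """Build a compressed context for one event pair."""
    clusters = intra_clusters(sentences, size)

    def home(event: str) -> List[int]:
        # First cluster containing a sentence that mentions the event.
        for c in clusters:
            if any(event in _words(sentences[i]) for i in c):
                return c
        raise ValueError(f"event not found: {event}")

    c1, c2 = home(e1), home(e2)
    # Same cluster: intra-cluster context. Otherwise: inter-cluster context.
    cluster = c1 if c1 == c2 else c1 + c2
    return summarize(sentences, cluster, [e1, e2])


doc = [
    "An earthquake struck the coast.",
    "Officials met on Monday.",
    "The weather was mild.",
    "Reporters arrived later.",
    "The tsunami flooded the town.",
    "Schools stayed closed.",
]
context = compress_for_pair(doc, "earthquake", "tsunami", size=2)
print(context)
```

In this toy document the two events are four sentences apart, yet the compressed context contains only the two sentences that mention them. Shortening the distance between events and dropping unrelated text in this way is the effect the abstract attributes to clustering and summarization. In a full system, the compressed context would then be passed to a relation classifier such as a RoBERTa-based model or a prompted LLM.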
