Generating Harder Cross-document Event Coreference Resolution Datasets using Metaphoric Paraphrasing

Shafiuddin Rehan Ahmed; Zhiyong Eric Wang; George Arthur Baker; Kevin Stowe; James H. Martin

TL;DR

This paper identifies a key gap in cross-document event coreference resolution: current datasets underestimate task difficulty because coreferring triggers overlap lexically and figurative language is absent. It introduces ECB+META by applying constrained metaphoric paraphrasing to ECB+ triggers using GPT-4, preserving coreference annotations, and producing two variants with different metaphor granularity. Through experiments with filtering-based and cross-encoder CDEC methods, plus GPT-4 as a pairwise classifier, the authors demonstrate that standard approaches struggle on ECB+META, highlighting the need for more robust, figurative-language-aware models. The work also provides analyses of lexical diversity and human agreement, and releases its data and methodology as a reproducible foundation for future research on challenging CDEC benchmarks and evaluation metrics.

Abstract

The most popular Cross-Document Event Coreference Resolution (CDEC) datasets fail to convey the true difficulty of the task, due to the lack of lexical diversity between coreferring event triggers (words or phrases that refer to an event). Furthermore, there is a dearth of event datasets for figurative language, limiting a crucial avenue of research in event comprehension. We address these two issues by introducing ECB+META, a lexically rich variant of Event Coref Bank Plus (ECB+) for CDEC on symbolic and metaphoric language. We use ChatGPT as a tool for the metaphoric transformation of sentences in the documents of ECB+, then tag the original event triggers in the transformed sentences in a semi-automated manner. In this way, we avoid the re-annotation of expensive coreference links. We present results that show existing methods that work well on ECB+ struggle with ECB+META, thereby paving the way for CDEC research on a much more challenging dataset. Code/data: https://github.com/ahmeshaf/llms_coref
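The notion of lexical diversity between coreferring triggers can be made concrete with a small sketch. The function below computes the fraction of coreferent mention pairs whose trigger strings match exactly; it is an illustrative measure, not necessarily the one used in the paper. The function name, the `(trigger, cluster_id)` input format, and the toy mentions are assumptions made for this example (only "killing" and "silencing the life" come from the Figure 1 caption).

```python
from itertools import combinations


def same_trigger_rate(mentions):
    """Fraction of coreferent mention pairs with identical trigger strings.

    `mentions` is a list of (trigger, cluster_id) tuples. This input format
    is hypothetical and is not the ECB+ file format.
    """
    # Group lower-cased triggers by their gold coreference cluster.
    clusters = {}
    for trigger, cluster_id in mentions:
        clusters.setdefault(cluster_id, []).append(trigger.lower())

    total = 0
    same = 0
    for triggers in clusters.values():
        # Every unordered pair inside one cluster is a coreferent pair.
        for a, b in combinations(triggers, 2):
            total += 1
            same += a == b
    return same / total if total else 0.0


# Toy data (invented for illustration): cluster ids stay fixed,
# only some trigger strings are paraphrased.
original = [
    ("killing", "c1"), ("killing", "c1"), ("killed", "c1"),
    ("arrested", "c2"), ("arrested", "c2"),
]
paraphrased = [
    ("silencing the life", "c1"), ("killing", "c1"), ("killed", "c1"),
    ("arrested", "c2"), ("put in chains", "c2"),
]

print(same_trigger_rate(original))
print(same_trigger_rate(paraphrased))
```

A lower rate after paraphrasing means fewer coreferent pairs can be resolved by string matching alone, which is the property the abstract attributes to ECB+META. Because the cluster ids are untouched, the coreference links carry over without re-annotation.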

Paper Structure (27 sections, 1 equation, 6 figures, 4 tables)

Introduction
Related Work
CDEC Datasets
Metaphoric Paraphrasing
CDEC Methods
Methodology
Metaphoric Paraphrasing using GPT-4
CDEC Methods
Filtering Step for CDEC:
Cross-encoder:
GPT-4 as Pairwise Classifier:
Results
Metaphor Quality Control
Coreference & Lexical Diversity
Filtering Scores:
...and 12 more sections

Figures (6)

Figure 1: Using GPT-4 to generate ECB+META from the ECB+ corpus. Event 2 and Event 3 are coreferent, while Event 1 is not. ECB+META has metaphorically transformed triggers, e.g., killing -> silencing the life. The triggers are hand-corrected by an annotator. ECB+META challenges previous work: Held et al. (2021) and Ahmed et al. (2023).
Figure 2: Metaphoric paraphrasing prompt following chain-of-thought reasoning. The prompt lists the steps for the model to follow.
Figure 3: Metaphoric Paraphrasing: Transforming a Sentence with Figurative Language. Event triggers, indicated in italics, undergo modification in paraphrased versions, annotated by GPT-4 with two variations.
Figure 4: Correct prediction of a coreferent mention pair across all datasets with $\texttt{CE}_{\texttt{KNN}}$. The pair has the same event trigger in each case.
Figure 5: Coreference predicted correctly in ECB+ but not in the META versions, because the triggers were changed.
...and 1 more figure
