EasyECR: A Library for Easy Implementation and Evaluation of Event Coreference Resolution Models

Yuncong Li, Tianhua Xu, Sheng-hua Zhong, Haiqin Yang

TL;DR

This work introduces EasyECR, an open-source library that standardizes data structures, tagging, and evaluation for Event Coreference Resolution (ECR) pipelines, enabling easy implementation and fair cross-dataset comparisons. By reproducing seven representative pipelines across ten datasets, the authors demonstrate that state-of-the-art pipelines often fail to generalize across datasets and that pipeline performance is highly sensitive to other components and evaluation settings. EasyECR provides unified baselines and a reproducible framework to accelerate robust ECR research, with potential practical impact on cross-document event analysis. The study also identifies reproducibility challenges and highlights actionable directions for scalable evaluation and tooling improvements in ECR.

Abstract

Event Coreference Resolution (ECR) is the task of clustering event mentions that refer to the same real-world event. Despite significant advancements, ECR research faces two main challenges: limited generalizability across domains due to narrow dataset evaluations, and difficulties in comparing models within diverse ECR pipelines. To address these issues, we develop EasyECR, the first open-source library designed to standardize data structures and abstract ECR pipelines for easy implementation and fair evaluation. More specifically, EasyECR integrates seven representative pipelines and ten popular benchmark datasets, enabling model evaluations on various datasets and promoting the development of robust ECR pipelines. Through extensive evaluation with EasyECR, we find that (i) the representative ECR pipelines do not generalize across multiple datasets, so evaluating ECR pipelines on multiple datasets is necessary, and (ii) every model in an ECR pipeline strongly affects pipeline performance, so when one model in a pipeline is compared, it is essential to keep the other models consistent. Additionally, reproducing ECR results is not trivial, and the developed library can help reduce this discrepancy. The experimental results provide valuable baselines for future research.

Paper Structure (15 sections, 4 figures, 12 tables)

This paper contains 15 sections, 4 figures, 12 tables.

Figures (4)

  • Figure 1: An example of event coreference resolution, which contains three coreferential chains from two documents: {saying, reporting}, {negotiations, negotiations} and {direct, direct}.
  • Figure 2: The ECR task is accomplished through a pipeline comprising multiple models. (a) ECR pipeline: The pipeline consists of multiple stages, each comprising two steps: computing mention pair distances and clustering mentions. Each step involves a specific model. (b) Illustration of a two-stage ECR pipeline: In the first stage, mentions are divided into coarse clusters within which coreferential event mentions are likely to be found. The second stage further divides these clusters into smaller ones, each of which indicates coreferential event mentions. (c) Finding coreferential mentions: This process resembles item recommendation in recommender systems [10.1145/2959100.2959190, ijcai2022p0771]. It involves identifying coreferential mentions for a given mention by progressing through the pipeline stages.
  • Figure 3: The framework of EasyECR. Part 2 includes a loop corresponding to Figure 2 (a).
  • Figure 4: An illustration showing how four CDCR pipelines are constructed using ECRTaggers.
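
The multi-stage structure described in the Figure 2 caption (each stage computes mention-pair distances, then clusters mentions, and later stages subdivide earlier clusters) can be sketched in a few lines of Python. This is a minimal illustration and not EasyECR's actual API: the `Mention` class, the toy `topic_distance` and `lemma_distance` functions, the thresholds, and the sample mentions are all hypothetical, and connected-components clustering stands in for whatever clustering model a real pipeline would use.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Mention:
    """Hypothetical event mention; not an EasyECR data structure."""
    mention_id: str
    doc_id: str
    topic: str
    lemma: str


Distance = Callable[[Mention, Mention], float]


def cluster_by_threshold(
    mentions: Sequence[Mention], distance: Distance, threshold: float
) -> List[List[Mention]]:
    """Link every pair with distance <= threshold; return connected components."""
    parent = {m.mention_id: m.mention_id for m in mentions}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(mentions, 2):
        if distance(a, b) <= threshold:
            parent[find(a.mention_id)] = find(b.mention_id)

    groups: dict = {}
    for m in mentions:
        groups.setdefault(find(m.mention_id), []).append(m)
    return list(groups.values())


def run_pipeline(
    mentions: Sequence[Mention], stages: Sequence[Tuple[Distance, float]]
) -> List[List[Mention]]:
    """Apply each stage inside the clusters produced by the previous stage."""
    clusters = [list(mentions)]
    for distance, threshold in stages:
        refined: List[List[Mention]] = []
        for cluster in clusters:
            refined.extend(cluster_by_threshold(cluster, distance, threshold))
        clusters = refined
    return clusters


def topic_distance(a: Mention, b: Mention) -> float:
    # Toy stage-1 distance (assumption): same topic means distance 0.
    return 0.0 if a.topic == b.topic else 1.0


def lemma_distance(a: Mention, b: Mention) -> float:
    # Toy stage-2 distance (assumption): same trigger lemma means distance 0.
    return 0.0 if a.lemma == b.lemma else 1.0


def cluster_ids(clusters: List[List[Mention]]) -> List[List[str]]:
    """Order-independent view of a clustering, for comparison and printing."""
    return sorted(sorted(m.mention_id for m in c) for c in clusters)


MENTIONS = [
    Mention("m1", "d1", "talks", "negotiation"),
    Mention("m2", "d2", "talks", "negotiation"),
    Mention("m3", "d1", "talks", "direct"),
    Mention("m4", "d2", "talks", "direct"),
    Mention("m5", "d3", "quake", "strike"),
]

STAGE_ONE = cluster_ids(run_pipeline(MENTIONS, [(topic_distance, 0.5)]))
RESULT = cluster_ids(
    run_pipeline(MENTIONS, [(topic_distance, 0.5), (lemma_distance, 0.5)])
)
```

Because each stage is just a distance function paired with a clustering step, replacing one stage while leaving the others untouched is a one-line change. That is the kind of controlled comparison the abstract argues for when it says the other models in a pipeline should remain consistent.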