Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM

Lauren Levine; Amir Zeldes

TL;DR

A comparison of the bridging instances annotated in the GUM, GENTLE and ARRAU corpora finds a large difference in the types of phenomena annotated as bridging.

Abstract

Comparing bridging annotations across coreference resources is difficult, largely due to a lack of standardization across definitions and annotation schemas and narrow coverage of disparate text domains across resources. To alleviate domain coverage issues and consolidate schemas, we compare guidelines and use interpretable predictive models to examine the bridging instances annotated in the GUM, GENTLE and ARRAU corpora. Examining these cases, we find that there is a large difference in types of phenomena annotated as bridging. Beyond theoretical results, we release a harmonized, subcategorized version of the test sets of GUM, GENTLE and the ARRAU Wall Street Journal data to promote meaningful and reliable evaluation of bridging resolution across domains.

Paper Structure

This paper contains 16 sections, 4 figures, and 8 tables.

Introduction
Background
Categorical Differences
Previously mentioned anaphors
Split bridging antecedents
Discontinuous mention spans
Entity types
Bridging subtypes
Predictive Models
Data
Models
Feature Analysis
Cross-Corpus Error Analysis
Harmonized Test Sets
Conclusion
...and 1 more section

Figures (4)

Figure 1: Feature importance of XGBoost classifiers trained on GUM and ARRAU WSJ
Figure 2: Distribution for antecedent-anaphor entity type combinations for GUM/GENTLE (only combinations with a proportion of 1% or higher are visualized)
Figure 3: Distribution for antecedent-anaphor entity type combinations for ARRAU WSJ (only combinations with a proportion of 1% or higher are visualized)
Figure 4: Feature importance of XGBoost classifiers trained on GUM and ARRAU WSJ for Mean Decrease Accuracy (MDA)
