BERT for Coreference Resolution: Baselines and Analysis

Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer

TL;DR

The paper fine-tunes BERT for coreference resolution within the c2f-coref framework and evaluates two strategies for splitting long documents into segments, on GAP and OntoNotes. BERT-large delivers substantial gains on GAP (+11.5 F1) and smaller but meaningful gains on OntoNotes (+3.9 F1), with notable improvements in pronoun resolution and lexical matching but limited benefit from longer context windows. The work identifies remaining challenges in document-level context, conversations, and paraphrase handling, and finds that extending context through overlapping segments may not help. Overall, it demonstrates the potential of transformer-based representations for coreference while outlining directions for better long-range encoding and pretraining.

Abstract

We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing. Our code and models are publicly available.

Paper Structure

This paper contains 15 sections, 3 equations, 5 tables.