Deep Reinforcement Learning for Mention-Ranking Coreference Models

Kevin Clark, Christopher D. Manning

TL;DR

The paper addresses the difficulty of tuning heuristic loss functions for coreference resolution by using reinforcement learning to optimize coreference evaluation metrics directly. Within a neural mention-ranking framework, it compares the REINFORCE policy gradient algorithm with a reward-rescaled max-margin objective and finds reward rescaling to be more effective. Experiments on the English and Chinese portions of the CoNLL 2012 Shared Task show significant gains over the prior state of the art, attributed to direct metric optimization and adaptive, reward-based cost shaping. The work suggests that reducing reliance on manually tuned hyperparameters can improve cross-language coreference performance and scalability.

Abstract

Coreference resolution systems are typically trained with heuristic loss functions that require careful tuning. In this paper we instead apply reinforcement learning to directly optimize a neural mention-ranking model for coreference evaluation metrics. We experiment with two approaches: the REINFORCE policy gradient algorithm and a reward-rescaled max-margin objective. We find the latter to be more effective, resulting in significant improvements over the current state-of-the-art on the English and Chinese portions of the CoNLL 2012 Shared Task.
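
To make the contrast between the two objectives concrete, the following is a minimal sketch for a single mention, not the authors' implementation. It assumes the model has already produced a score for each candidate action (link to an antecedent, or start a new cluster) and that a reward for each action is available from some coreference metric. The names `scores`, `rewards`, and `is_correct`, and the numeric values, are hypothetical placeholders. The REINFORCE-style loss is the negative expected reward under a softmax over the scores. The reward-rescaled loss is a slack-rescaled max-margin loss in which the cost of each action is assumed to be the reward lost relative to the best available action.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over candidate-action scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def reinforce_loss(scores, rewards):
    """Negative expected reward under the softmax policy (assumed form)."""
    probs = softmax(scores)
    return -float(np.dot(probs, rewards))


def reward_rescaled_margin_loss(scores, rewards, is_correct):
    """Max-margin loss with reward-based costs (assumed form).

    The cost of an action is the reward lost by taking it instead of the
    best action, so the best action has cost 0 and the loss is never negative.
    """
    costs = rewards.max() - rewards
    # Score of the highest-scoring correct action for this mention.
    best_correct = scores[is_correct].max()
    return float(np.max(costs * (1.0 + scores - best_correct)))


if __name__ == "__main__":
    # Hypothetical values for one mention with three candidate actions.
    scores = np.array([2.0, 0.5, 1.0])
    rewards = np.array([0.6, 0.9, 0.4])
    is_correct = np.array([False, True, False])
    print(reinforce_loss(scores, rewards))
    print(reward_rescaled_margin_loss(scores, rewards, is_correct))
```

In this sketch the margin loss only penalizes an incorrect action in proportion to how much reward it would cost, which is the intuition behind replacing hand-tuned error costs with reward-derived ones.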

Paper Structure

This paper contains 10 sections, 9 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Density plot of the costs $\Delta_{r}$ associated with different error types on the English CoNLL 2012 test set.