CARE: Co-Attention Network for Joint Entity and Relation Extraction

Wenjun Kong, Yamei Xia

TL;DR

This paper targets joint entity and relation extraction by addressing two core challenges: feature confusion from shared representations and limited interaction between NER and RE. It introduces CARE, a co-attention network with three modules (encoder, co-attention, and classification) that uses parallel encoding to learn task-specific representations and a co-attention mechanism to enable bidirectional information flow between NER and RE. CARE models both tasks as table-filling problems and optimizes a joint loss, enabling entities to inform relation predictions and vice versa. Extensive experiments on NYT, WebNLG, and SciERC show that CARE consistently improves over state-of-the-art baselines, with ablations confirming the effectiveness of the distance embeddings, shared representations, and multi-layer co-attention in producing robust improvements for both subtasks.
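
The table-filling view mentioned above can be illustrated with a small sketch. It assumes one common labeling convention: an entity is marked at the cell (start token, end token) of the NER table, and a relation is marked at every cell pairing a head-entity token with a tail-entity token in the RE table. The paper's exact tagging scheme may differ, and `build_tables`, the example sentence, and its labels are hypothetical names chosen for illustration.

```python
def build_tables(n_tokens, entities, relations, none_label="O"):
    """Build two n x n label tables for one sentence (illustrative convention).

    entities:  list of (start, end, type) with inclusive token indices.
    relations: list of (head_idx, tail_idx, rel_type) indexing into `entities`.
    """
    ner = [[none_label] * n_tokens for _ in range(n_tokens)]
    rel = [[none_label] * n_tokens for _ in range(n_tokens)]

    # NER table: mark the (start, end) cell of each entity span with its type.
    for start, end, etype in entities:
        ner[start][end] = etype

    # RE table: mark every (head token, tail token) pair with the relation type.
    for head_idx, tail_idx, rtype in relations:
        h_start, h_end, _ = entities[head_idx]
        t_start, t_end, _ = entities[tail_idx]
        for i in range(h_start, h_end + 1):
            for j in range(t_start, t_end + 1):
                rel[i][j] = rtype
    return ner, rel


# Hypothetical example sentence with SciERC-style labels.
tokens = ["neural", "networks", "for", "speech", "recognition"]
entities = [(0, 1, "Method"), (3, 4, "Task")]
relations = [(0, 1, "Used-for")]

ner_table, re_table = build_tables(len(tokens), entities, relations)
```

Under this convention both subtasks become per-cell classification over tables of the same shape, which is what makes a joint objective (for example, a sum of the two per-table classification losses) straightforward to define.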

Abstract

Joint entity and relation extraction is a fundamental task in information extraction, consisting of two subtasks: named entity recognition and relation extraction. However, most existing joint extraction methods suffer from feature confusion or inadequate interaction between the two subtasks. To address these challenges, we propose a Co-Attention network for joint entity and Relation Extraction (CARE). Our approach adopts a parallel encoding strategy to learn separate representations for each subtask, aiming to avoid feature overlap or confusion. At its core is a co-attention module that captures two-way interaction between the two subtasks, allowing the model to leverage entity information for relation prediction and vice versa, thus promoting mutual enhancement. Through extensive experiments on three benchmark datasets for joint entity and relation extraction (NYT, WebNLG, and SciERC), we demonstrate that our proposed model outperforms existing baseline models. Our code will be available at https://github.com/kwj0x7f/CARE.
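
The two-way interaction described in the abstract can be sketched as a pair of cross-attention steps: the NER representation attends over the RE representation and vice versa, and each result is added back to its own stream. This is a minimal, assumption-laden sketch, not the authors' architecture: it uses plain scaled dot-product attention with no learned projections, treats each representation as a token-by-feature matrix, and all function names and toy values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def cross_attend(queries, context):
    """Scaled dot-product attention of `queries` over `context` (no projections)."""
    d = len(queries[0])
    scores = matmul(queries, transpose(context))
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    weights = softmax_rows(scores)
    return matmul(weights, context), weights


def co_attention_layer(h_ner, h_re):
    """One bidirectional step: each stream reads from the other, with a residual add."""
    ner_ctx, _ = cross_attend(h_ner, h_re)   # NER stream reads RE features
    re_ctx, _ = cross_attend(h_re, h_ner)    # RE stream reads NER features
    new_ner = [[a + b for a, b in zip(r, c)] for r, c in zip(h_ner, ner_ctx)]
    new_re = [[a + b for a, b in zip(r, c)] for r, c in zip(h_re, re_ctx)]
    return new_ner, new_re


# Toy task-specific representations: 3 tokens, 2 features each (made-up values).
h_ner = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_re = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

# Stacking the layer loosely mirrors the idea of multi-layer co-attention.
for _ in range(2):
    h_ner, h_re = co_attention_layer(h_ner, h_re)
```

Keeping two separate streams and exchanging information only through attention is one way to let the subtasks inform each other without collapsing them into a single shared representation.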

Paper Structure (23 sections, 12 equations, 3 figures, 3 tables)

This paper contains 23 sections, 12 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: An example from the SciERC dataset (Luan et al., 2018). Entities are distinguished by color.
  • Figure 2: The overall framework of CARE, which consists of three modules: an encoder module, a co-attention module, and a classification module. NER-specific and RE-specific representations are colored green and red, respectively.
  • Figure 3: Case study on the SciERC test set. Entities are distinguished by color.