GEGA: Graph Convolutional Networks and Evidence Retrieval Guided Attention for Enhanced Document-level Relation Extraction

Yanxu Mao, Xiaohui Chen, Peipei Liu, Tiehan Cui, Zuhui Yue, Zheng Li

TL;DR

This work proposes GEGA, a novel model for document-level relation extraction (DocRE) that leverages graph neural networks to construct multiple weight matrices that guide attention allocation to evidence sentences, and that employs multi-scale representation aggregation to enhance evidence retrieval (ER).

Abstract

Document-level relation extraction (DocRE) aims to extract relations between entities from unstructured document text. Compared to sentence-level relation extraction, it requires more complex semantic understanding over a broader text context. Some recent studies use logical rules within evidence sentences to improve DocRE performance. However, for data that does not provide evidence sentences, researchers often obtain a list of evidence sentences for the entire document through evidence retrieval (ER). As a result, DocRE faces two challenges: first, the relevance between evidence and entity pairs is weak; second, complex cross-relations among multiple long-distance entities are insufficiently extracted. To overcome these challenges, we propose GEGA, a novel model for DocRE. The model leverages graph neural networks to construct multiple weight matrices that guide attention allocation to evidence sentences. It also employs multi-scale representation aggregation to enhance ER. We then integrate the most efficient evidence information to implement both fully supervised and weakly supervised training of the model. We evaluate GEGA on three widely used benchmark datasets: DocRED, Re-DocRED, and Revisit-DocRED. The experimental results indicate that our model achieves comprehensive improvements over the existing SOTA model.

Paper Structure

This paper contains 33 sections, 16 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Examples of relations from DocRED, with entities marked in different colors, and curves indicating various relations between the entities.
  • Figure 2: The overall architecture of our method. Gray circles of different shades belong to different sentences, and the color depth of each square indicates its attention weight score.
  • Figure 3: The overall architecture of the Multi-GraphConv (M-G) layer, which comprises three sub-layers, each containing $n$ heads.
  • Figure 4: Step diagram of Co-prediction for RE and ER.
  • Figure 5: Loss value variation of the GEGA model trained on the DocRED dataset.
  • ...and 2 more figures