Augmenting Document-level Relation Extraction with Efficient Multi-Supervision

Xiangyu Lin; Weijia Jia; Zhiguo Gong

TL;DR

The paper addresses the inefficiency and noise inherent in leveraging distant supervision for document-level relation extraction. It introduces Efficient Multi-Supervision (EMS), which combines Document Informativeness Ranking (DIR) to selectively augment DS data with informative documents and Multi-Supervision Ranking-based Loss (MSRL) to robustly fuse distant, expert, and self supervision, mitigating label noise. On DocRED, EMS achieves competitive or superior F1 scores with dramatically reduced time costs compared to full-DS pretraining and other DS-based baselines, validated through ablations that demonstrate the necessity of DIR and MSRL. The approach offers a practical, scalable path to exploiting large DS datasets for DocRE in real-world settings, balancing accuracy and efficiency.

Abstract

Despite its popularity in sentence-level relation extraction, distantly supervised data is rarely used in existing work on document-level relation extraction because of its noisy nature and low information density. In its current applications, distantly supervised data is mostly used as a whole for pretraining, which is time-inefficient. To fill the gap in efficient and robust use of distantly supervised training data, we propose Efficient Multi-Supervision for document-level relation extraction: we first select a subset of informative documents from the massive dataset by combining distant supervision with expert supervision, then train the model with a Multi-Supervision Ranking Loss that integrates knowledge from multiple sources of supervision to alleviate the effects of noise. Experiments demonstrate that our method improves model performance with higher time efficiency than existing baselines.

Paper Structure (15 sections, 4 equations, 2 figures, 4 tables)

Introduction
Related Work
Methodology
Preliminary
Document Informativeness Ranking
Multi-Supervision Ranking-based Loss
Experiments
Datasets and Settings
Compared Baselines
Main Results
Ablation Study
Case Study
Conclusions
Limitations
Time Efficiency

Figures (2)

Figure 1: Illustration of EMS, which contains two main components: DIR and MSRL. In MSRL, Agg. denotes agreements, Rec. denotes recommendations, and Oth. denotes others.
Figure 2: A retrieved document with some representative instances. The numbers are the logit values of the relation classes after training, and "located in" is the abbreviation of relation class "located in the administrative territorial entity".
