COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction

Zhichao Duan; Tengyu Pan; Zhenyu Li; Xiuxing Li; Jianyong Wang

TL;DR

This work tackles document-level relation extraction under challenging data conditions, including noisy labels and extreme class imbalance. It introduces COMM, a two-stage framework that combines instance-aware reasoning augmentation (IARA) with margin-centered optimization, featuring the Concentrated Margin Maximization (CMM) loss and an adaptive Threshold Class (TH). By aligning logits with a dynamic threshold and reweighting margins to emphasize hard positives and suppress easy negatives, COMM achieves robust improvements across DocRED and Re-DocRED, especially when trained on lower-quality data. The results demonstrate the practical impact of margin-focused optimization for DocRE and establish COMM as a versatile, data-aware enhancement to existing relational encoders.

Abstract

Document-level relation extraction (DocRE) is the process of identifying and extracting relations between entities that span multiple sentences within a document. Due to its realistic settings, DocRE has garnered increasing research attention in recent years. Previous research has mostly focused on developing sophisticated encoding models to better capture the intricate patterns between entity pairs. While these advancements are undoubtedly crucial, an even more foundational challenge lies in the data itself. The complexity inherent in DocRE makes the labeling process prone to errors, compounded by the extreme sparsity of positive relation samples, which is driven by both the limited availability of positive instances and the broad diversity of positive relation types. These factors can lead to biased optimization processes, further complicating the task of accurate relation extraction. Recognizing these challenges, we have developed a robust framework called COMM to better solve DocRE. COMM operates by initially employing an instance-aware reasoning method to dynamically capture pertinent information of entity pairs within the document and extract relational features. Following this, COMM takes into account the distribution of relations and the difficulty of samples to dynamically adjust the margins between prediction logits and the decision threshold, a process we call Concentrated Margin Maximization. In this way, COMM not only enhances the extraction of relevant relational features but also boosts DocRE performance by addressing the specific challenges posed by the data. Extensive experiments and analysis demonstrate the versatility and effectiveness of COMM, especially its robustness when trained on low-quality data (achieves >10% performance gains).
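The idea of adjusting margins between prediction logits and a decision threshold can be made concrete with a small sketch. The code below is not the paper's CMM loss, whose exact form is not given in this text. It is a minimal illustration that assumes a learned threshold-class logit per entity pair and substitutes a focal-style weight (the `gamma` exponent is a hypothetical parameter) so that samples with a large correct margin contribute little and hard samples dominate. The function names `margin_loss` and `predict` are likewise illustrative.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def margin_loss(logits, th_logit, positives, gamma=2.0):
    """Illustrative threshold-margin loss for one entity pair.

    logits:    per-relation scores
    th_logit:  score of the threshold class (assumed to be learned)
    positives: set of indices of gold relations
    gamma:     hypothetical focusing exponent (focal-style, an assumption)
    """
    total = 0.0
    for r, score in enumerate(logits):
        margin = score - th_logit
        # Positives should sit above the threshold, negatives below it,
        # so flip the sign for negatives to get a "correctness" margin.
        signed = margin if r in positives else -margin
        # Easy samples (large correct margin) get a weight near 0.
        weight = (1.0 - sigmoid(signed)) ** gamma
        total += weight * softplus(-signed)
    return total


def predict(logits, th_logit):
    """Return the relations whose logit exceeds the threshold logit."""
    return [r for r, score in enumerate(logits) if score > th_logit]
```

With `gamma=0.0` the weight is always 1 and the function reduces to an unweighted softplus margin loss, which makes the effect of the reweighting easy to isolate when comparing the two settings.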
