TTM-RE: Memory-Augmented Document-Level Relation Extraction

Chufan Gao; Xuan Wang; Jimeng Sun

TL;DR

This work addresses document-level relation extraction under noisy, distantly supervised data by introducing TTM-RE, a memory-augmented architecture that uses a Token Turing Machine to reprocess head and tail entity representations. Coupled with a noise-robust SSR-PU loss that accounts for prior shift in positive-unlabeled data, TTM-RE achieves state-of-the-art performance on ReDocRED and ChemDisGene, especially when trained on large-scale noisy data. Ablation studies show that memory tokens and deeper memory encoders provide tangible gains, improve rare-label handling, and remain effective across domains, though memory initialization and data size influence the extent of the benefits. The approach demonstrates a practical way to exploit abundant noisy data for document-level RE and suggests memory-augmented mechanisms as a promising direction beyond improvements to the loss function alone.

Abstract

Document-level relation extraction aims to categorize the association between any two entities within a document. We find that previous methods for document-level relation extraction are ineffective in exploiting the full potential of large amounts of training data with varied noise levels. For example, in the ReDocRED benchmark dataset, state-of-the-art methods trained on the large-scale, lower-quality, distantly supervised training data generally do not perform better than those trained solely on the smaller, high-quality, human-annotated training data. To unlock the full potential of large-scale noisy training data for document-level relation extraction, we propose TTM-RE, a novel approach that integrates a trainable memory module, known as the Token Turing Machine, with a noise-robust loss function that accounts for the positive-unlabeled setting. Extensive experiments on ReDocRED, a benchmark dataset for document-level relation extraction, reveal that TTM-RE achieves state-of-the-art performance (with an absolute F1 score improvement of over 3%). Ablation studies further illustrate the superiority of TTM-RE in other domains (the ChemDisGene dataset in the biomedical domain) and under highly unlabeled settings.
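To make the idea of a memory module that reprocesses head and tail entity representations more concrete, the following is a minimal sketch of an attention-pooling read over memory tokens plus one entity embedding. It is not the authors' implementation: the function `read_from_memory`, the single query vector, the toy dimensions, and the random initialization are all assumptions chosen for illustration, and the read is a simplified stand-in for the token summarization used in Token Turing Machines.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def read_from_memory(memory, entity, queries):
    """Summarize [memory tokens + entity embedding] into len(queries) tokens.

    Each query (a hypothetical learned vector) scores every token; the
    corresponding output is the softmax-weighted average of all tokens.
    """
    tokens = memory + [entity]
    dim = len(entity)
    outputs = []
    for q in queries:
        weights = softmax([dot(q, t) / math.sqrt(dim) for t in tokens])
        outputs.append(
            [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


random.seed(0)
dim, num_memory = 8, 4  # toy sizes, not the paper's settings
memory = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_memory)]
queries = [[random.gauss(0, 1) for _ in range(dim)]]  # one read token
head = [random.gauss(0, 1) for _ in range(dim)]  # stand-in head entity embedding
tail = [random.gauss(0, 1) for _ in range(dim)]  # stand-in tail entity embedding

# The same memory is read once for the head and once for the tail entity.
head_aug = read_from_memory(memory, head, queries)[0]
tail_aug = read_from_memory(memory, tail, queries)[0]
```

In this sketch, `head_aug` and `tail_aug` play the role of the memory-augmented entity representations that would be passed to a relation classifier; in a trained model the memory tokens and queries would be learned parameters rather than random draws.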

Paper Structure (36 sections, 9 equations, 6 figures, 9 tables)

Introduction
Related Work
Document-level Relation Extraction
Memory-based Models in NLP
Methodology
Problem Definition
Token Turing Machines
Initializing Memory Tokens
Reading from Memory
Processing of Head and Tail Entities
Noise-Robust Loss Function (SSR-PU)
Experimental Settings
Datasets
ReDocRED
ChemDisGene
...and 21 more sections

Figures (6)

Figure 1: Differences between the generic document relation extraction approach and TTM-RE for document-level relation extraction. The memory module processes the input entities and outputs to the relation classifier. We investigate how adding the memory component affects performance (such as different datasets and memory sizes).
Figure 2: Sample document for document-level relation extraction from DocRED (Yao et al., 2019). Here, the head entity is related to the tail entity by "P131: located in the administrative territorial entity".
Figure 3: Overall framework of TTM-RE. Given an example document and an expected relation distribution, we use an LLM (RoBERTa-large) to encode the input tokens in a single pass and represent head and tail entities by their token representations, which are then fed into a memory module (in gray). The memory module then returns two memory-augmented versions of the head and tail entities for final relation classification.
Figure 4: Left: Effect of the number of layers in the memory encoder. More layers imply a more powerful memory module. Right: Effect of the number of memory tokens (Memory Size) available to TTM-RE on the test dataset of ReDocRED.
Figure 5: Plot of PCA-transformed head entities along with 200 memory entities. Tail entities are omitted due to redundancy.
...and 1 more figure
