AutoRE: Document-Level Relation Extraction with Large Language Models

Lilong Xue; Dan Zhang; Yuxiao Dong; Jie Tang

TL;DR

AutoRE introduces the Relation-Head-Facts (RHF) paradigm for document-level relation extraction (DocRE) and implements it as three QLoRA modules, trained with parameter-efficient fine-tuning (PEFT), on a Mistral-7B backbone. By decomposing extraction into relation, head, and fact steps with instruction-tuning templates, AutoRE achieves state-of-the-art results on Re-DocRED and shows strong cross-model generalization. The approach targets the DocRE-specific challenge of handling many relations and multiple triplets spread across a document, while remaining computationally efficient. Limitations include reliance on a fixed relation vocabulary and in-domain training data; future work will broaden relation coverage and address unseen relations. Code and a demo are publicly available.

Abstract

Large Language Models (LLMs) have demonstrated exceptional abilities in comprehending and generating text, motivating numerous researchers to use them for Information Extraction (IE) tasks, including Relation Extraction (RE). Nonetheless, most existing methods are designed for Sentence-level Relation Extraction (SentRE), which typically involves a restricted set of relations and triplet facts within a single sentence. Furthermore, certain approaches treat relations as candidate choices integrated into prompt templates, which leads to inefficient processing and suboptimal performance on Document-Level Relation Extraction (DocRE). DocRE poses distinct challenges because it entails handling multiple relations and triplet facts distributed across a document. To overcome these limitations, we introduce AutoRE, an end-to-end DocRE model that adopts a novel RE paradigm named RHF (Relation-Head-Facts). Unlike existing approaches, AutoRE does not rely on the assumption of known relation options, making it more reflective of real-world scenarios. Additionally, we have developed an easily extensible RE framework using a Parameter-Efficient Fine-Tuning (PEFT) algorithm (QLoRA). Our experiments on the Re-DocRED dataset show that AutoRE achieves state-of-the-art results, surpassing TAG by 10.03% and 9.03% on the dev and test sets, respectively. The code is available at https://github.com/THUDM/AutoRE and the demonstration video is provided at https://www.youtube.com/watch?v=IhKRsZUAxKk.
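
The RHF decomposition described above can be illustrated with a minimal sketch. It assumes the three steps are chained so that each one conditions on the output of the previous one. The prompt templates, the `extract_rhf` function, and the `toy_generate` stand-in for a fine-tuned model are all hypothetical names introduced here for illustration; they are not taken from the AutoRE codebase, and the fact step is simplified to returning tail entities.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical prompt templates; the paper's actual instruction templates may differ.
RELATION_PROMPT = "Document: {doc}\nList the relations expressed in the document."
HEAD_PROMPT = "Document: {doc}\nRelation: {rel}\nList the head entities for this relation."
FACT_PROMPT = "Document: {doc}\nRelation: {rel}\nHead: {head}\nList the tail entities."


def extract_rhf(doc: str, generate: Callable[[str], List[str]]) -> List[Triplet]:
    """Chain the relation -> head -> facts steps.

    `generate` stands in for a call to a language model that returns a
    list of strings for a given prompt (an assumption of this sketch).
    """
    triplets: List[Triplet] = []
    # Step 1: ask which relations the document expresses.
    for rel in generate(RELATION_PROMPT.format(doc=doc)):
        # Step 2: for each relation, ask for candidate head entities.
        for head in generate(HEAD_PROMPT.format(doc=doc, rel=rel)):
            # Step 3: for each (relation, head) pair, ask for the facts.
            for tail in generate(FACT_PROMPT.format(doc=doc, rel=rel, head=head)):
                triplets.append((head, rel, tail))
    return triplets


def toy_generate(prompt: str) -> List[str]:
    """Toy stand-in for a model, keyed on the instruction in the last prompt line."""
    instruction = prompt.splitlines()[-1]
    if instruction.startswith("List the relations"):
        return ["country"]
    if instruction.startswith("List the head"):
        return ["Beijing"]
    return ["China"]


result = extract_rhf("Beijing is the capital of China.", toy_generate)
```

Because no candidate relation list is passed into the prompts, the relation step has to name the relations itself, which mirrors the abstract's point that AutoRE does not assume known relation options. In a real system, `generate` would wrap the fine-tuned model and parse its text output.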
