A Comprehensive Survey on Relation Extraction: Recent Advances and New Frontiers
Xiaoyan Zhao, Yang Deng, Min Yang, Lingzhi Wang, Rui Zhang, Hong Cheng, Wai Lam, Ying Shen, Ruifeng Xu
TL;DR
Relation extraction (RE) identifies structured relations between entities in unstructured text, formalized as predicting triplets $\langle head\_entity, relation, tail\_entity\rangle$. The paper surveys deep learning methods organized by text representation, context encoding, and triplet prediction, and reviews datasets, evaluation metrics, and domain-specific challenges. It highlights the dominance of Transformer-based models and pre-trained language models (PLMs), discusses challenges such as low-resource, cross-sentence, multi-modal, temporal, and evolutionary RE, and proposes directions including cross-lingual and explainable RE. Together, these give an overview of current progress and practical guidance for building scalable, real-world RE systems.
Abstract
Relation extraction (RE) involves identifying the relations between entities in underlying content. RE serves as the foundation for many natural language processing (NLP) and information retrieval applications, such as knowledge graph completion and question answering. In recent years, deep neural networks have dominated the field of RE and made noticeable progress. More recently, large pre-trained language models have advanced the state of the art in RE further. This survey provides a comprehensive review of existing deep learning techniques for RE. First, we introduce RE resources, including datasets and evaluation metrics. Second, we propose a new taxonomy that categorizes existing work from three perspectives: text representation, context encoding, and triplet prediction. Third, we discuss several important challenges faced by RE and summarize potential techniques to tackle them. Finally, we outline some promising future directions and prospects in this field. This survey is expected to facilitate researchers' collaborative efforts to address the challenges of real-world RE systems.
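As a concrete illustration of the triplet formulation and of how extracted triplets are commonly scored, the following is a minimal sketch, not the survey's own code. It assumes exact-match comparison of (head entity, relation, tail entity) triplets and computes precision, recall, and F1 over sets. The `Triplet` class, the `triplet_prf` function, the relation labels, and the example sentences are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One extracted fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


def triplet_prf(predicted, gold):
    """Exact-match precision, recall, and F1 between two triplet collections.

    A predicted triplet counts as correct only if head, relation, and tail
    all match a gold triplet (an assumption; benchmarks differ in strictness).
    """
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotations and model output for one document.
gold = [
    Triplet("Marie Curie", "born_in", "Warsaw"),
    Triplet("Marie Curie", "field_of_work", "physics"),
]
pred = [
    Triplet("Marie Curie", "born_in", "Warsaw"),
    Triplet("Marie Curie", "born_in", "Paris"),
]

precision, recall, f1 = triplet_prf(pred, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Representing triplets as frozen dataclasses makes them hashable, so set intersection gives the count of exactly matching predictions directly. Real benchmarks may apply looser matching (for example, on entity spans or types), which this sketch does not model.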
