Few-Shot, No Problem: Descriptive Continual Relation Extraction
Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen
TL;DR
The paper addresses Few-Shot Continual Relation Extraction (FCRE), where relations arrive in sequential tasks with only a few samples each. It proposes a retrieval-based framework that uses relation descriptions generated by a Large Language Model (LLM) as stable class prototypes, a bi-encoder for joint sample/class representation learning, and a Descriptive Retrieval Inference (DRI) mechanism that unifies prototype proximity and description semantics. The joint training objective combines sample-space losses (Supervised Contrastive and Hard Soft Margin) with description-centered losses (Hard Margin and Mutual Information), and a memory-based rehearsal strategy preserves prior knowledge. Empirical results on FewRel and TACRED with multiple backbones demonstrate state-of-the-art accuracy and robust forgetting resistance, highlighting the value of descriptive grounding and retrieval for dynamic relation extraction.
Abstract
Few-shot Continual Relation Extraction is a crucial challenge for enabling AI systems to identify and adapt to evolving relationships in dynamic real-world domains. Traditional memory-based approaches often overfit to the limited samples and fail to reinforce old knowledge. Data scarcity in few-shot scenarios exacerbates these issues by hindering effective data augmentation in the latent space. In this paper, we propose a novel retrieval-based solution, starting with a large language model to generate descriptions for each relation. From these descriptions, we introduce a bi-encoder retrieval training paradigm to enrich both sample and class representation learning. Leveraging these enhanced representations, we design a retrieval-based prediction method where each sample "retrieves" the best-fitting relation via a reciprocal rank fusion score that integrates both relation description vectors and class prototypes. Extensive experiments on multiple datasets demonstrate that our method significantly advances the state-of-the-art by maintaining robust performance across sequential tasks, effectively addressing catastrophic forgetting.
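The retrieval-based prediction step can be illustrated with a minimal sketch of reciprocal rank fusion. It assumes the standard unweighted form, 1 / (k + rank), summed over two rankings (similarity to description vectors and similarity to class prototypes), with cosine similarity and k = 60. The paper's exact scoring, weighting, and similarity function may differ, and the function names, relation names, and toy vectors below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranks(scores):
    """Map each relation to its 1-based rank (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {relation: i + 1 for i, relation in enumerate(ordered)}


def rrf_predict(sample, descriptions, prototypes, k=60):
    """Fuse two rankings with unweighted reciprocal rank fusion.

    Assumption: both dictionaries share the same relation keys, and
    k = 60 is a conventional default, not a value taken from the paper.
    """
    desc_rank = ranks({r: cosine(sample, v) for r, v in descriptions.items()})
    proto_rank = ranks({r: cosine(sample, v) for r, v in prototypes.items()})
    fused = {
        r: 1.0 / (k + desc_rank[r]) + 1.0 / (k + proto_rank[r])
        for r in descriptions
    }
    return max(fused, key=fused.get), fused


# Toy 2-dimensional embeddings (hypothetical values for illustration only).
sample = [1.0, 0.2]
descriptions = {"founded_by": [0.9, 0.1], "born_in": [0.1, 0.9]}
prototypes = {"founded_by": [0.8, 0.3], "born_in": [0.2, 1.0]}

predicted, scores = rrf_predict(sample, descriptions, prototypes)
print(predicted, scores)
```

Because fusion operates on ranks rather than raw similarities, the two signals need not share a scale, which is one plausible reason to prefer it over averaging the scores directly.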
