Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction
Da Luo, Yanglei Gan, Rui Hou, Run Lin, Qiao Liu, Yuxiang Cai, Wannian Gao
TL;DR
This paper addresses few-shot relation extraction under data scarcity with SaCon, a synergistic anchored contrastive pre-training framework that jointly learns sentence-space and label-space representations through multi-view contrasts. The approach uses a dual-encoder architecture and knowledge-base-driven pre-training, combining symmetrical sentence-anchored and label-anchored contrastive losses with a masked language modeling (MLM) objective. The resulting aligned representations transfer well to the FewRel benchmarks and hold up under domain shift and in zero-shot settings. Experiments show consistent gains over strong FSRE baselines and existing contrastive pre-training methods, and ablations confirm that the two contrastive views are complementary. The work provides a scalable, publicly available framework for FSRE pre-training.
Abstract
Few-shot Relation Extraction (FSRE) aims to extract relational facts from only a handful of labeled examples. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within a supervised contrastive learning framework that considers both instances and label facts. However, how to effectively exploit massive numbers of instance-label pairs to semantically enrich the learned representations in this paradigm has not been fully explored. To address this gap, we introduce a novel synergistic anchored contrastive pre-training framework. It is motivated by the insight that the different views conveyed by instance-label pairs each capture incomplete yet complementary aspects of the underlying textual semantics. Specifically, our framework uses a symmetrical contrastive objective that comprises both a sentence-anchored and a label-anchored contrastive loss. By combining these two losses, the model establishes a robust and uniform representation space. This space captures the reciprocal alignment of feature distributions between instances and relational facts, while maximizing the mutual information across the different views of the same relation. Experimental results demonstrate that our framework achieves significant performance gains over baseline models on downstream FSRE tasks. Furthermore, our approach adapts better to the challenges of domain shift and zero-shot relation extraction. Our code is available online at https://github.com/AONE-NLP/FSRE-SaCon.
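The symmetrical objective described above can be illustrated with a small sketch. The abstract does not give the exact loss formula, so the code below assumes a generic supervised InfoNCE form with a temperature parameter; it is not the authors' implementation, and it leaves out the encoders and the MLM term. The names anchored_loss, symmetric_loss, and temperature are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    # Assumes a non-zero vector; real embeddings would come from an encoder.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def anchored_loss(anchors, candidates, anchor_rel, cand_rel, temperature=0.1):
    """Supervised InfoNCE (assumed form): each anchor is pulled toward the
    candidates that share its relation id and pushed away from the rest."""
    total, counted = 0.0, 0
    for anchor, rel in zip(anchors, anchor_rel):
        logits = [dot(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(z - top) for z in logits))
        positives = [z for z, r in zip(logits, cand_rel) if r == rel]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        total += -sum(z - log_denom for z in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def symmetric_loss(sent_vecs, label_vecs, sent_rel, label_rel, temperature=0.1):
    """Sum of the sentence-anchored and label-anchored terms."""
    sents = [normalize(v) for v in sent_vecs]
    labels = [normalize(v) for v in label_vecs]
    sentence_anchored = anchored_loss(sents, labels, sent_rel, label_rel, temperature)
    label_anchored = anchored_loss(labels, sents, label_rel, sent_rel, temperature)
    return sentence_anchored + label_anchored


# Toy data: two sentences and two label descriptions in a 2-d space.
sent_vecs = [[1.0, 0.0], [0.0, 1.0]]
label_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_loss(sent_vecs, label_vecs, [0, 1], [0, 1])
misaligned = symmetric_loss(sent_vecs, label_vecs, [0, 1], [1, 0])
print(aligned < misaligned)
```

In this toy setup the loss is small when each sentence embedding lies close to the label embedding of its own relation and large when the pairing is swapped. Using both anchoring directions means neither the sentence space nor the label space is treated only as a fixed set of targets.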
