Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction

Da Luo, Yanglei Gan, Rui Hou, Run Lin, Qiao Liu, Yuxiang Cai, Wannian Gao

TL;DR

This paper tackles few-shot relation extraction under data scarcity by introducing SaCon, a synergistic anchored contrastive pre-training framework that jointly learns sentence- and label-space representations via multi-view contrasts. The approach employs a dual-encoder architecture and a knowledge-base-driven pre-training regime that combines symmetrical sentence- and label-anchored contrastive losses with MLM, yielding robust, aligned representations that transfer well to the FewRel benchmarks and remain resilient under domain transfer and in zero-shot settings. Comprehensive experiments demonstrate consistent gains over strong FSRE baselines and existing contrastive pre-training methods, with ablations confirming the complementary value of the two contrastive views. The work provides a scalable, publicly available framework for FSRE pre-training, with practical benefits for robust relation extraction across domains and in zero-shot settings.

Abstract

Few-shot Relation Extraction (FSRE) aims to extract relational facts from a sparse set of labeled corpora. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within the framework of supervised contrastive learning, which considers both instances and label facts. However, how to effectively harness massive instance-label pairs to endow the learned representations with semantic richness in this learning paradigm has not been fully explored. To address this gap, we introduce a novel synergistic anchored contrastive pre-training framework. This framework is motivated by the insight that the diverse viewpoints conveyed through instance-label pairs capture incomplete yet complementary intrinsic textual semantics. Specifically, our framework involves a symmetrical contrastive objective that encompasses both sentence-anchored and label-anchored contrastive losses. By combining these two losses, the model establishes a robust and uniform representation space. This space effectively captures the reciprocal alignment of feature distributions among instances and relational facts while maximizing mutual information across diverse perspectives within the same relation. Experimental results demonstrate that our framework achieves significant performance gains over baseline models in downstream FSRE tasks. Furthermore, our approach adapts better to the challenges of domain shift and zero-shot relation extraction. Our code is available online at https://github.com/AONE-NLP/FSRE-SaCon.
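
The symmetrical objective described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes a batch of paired sentence and label embeddings, cosine similarity scaled by a temperature `tau`, and a supervised-contrastive form in which every candidate sharing the anchor's relation counts as a positive. The function names, the `tau` default, and the equal weighting of the two losses are all illustrative assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def anchored_loss(anchors, candidates, relations, tau=0.1):
    """Contrast each anchor against all candidates.

    Candidates with the same relation id as the anchor are positives.
    Assumes anchors[i] and candidates[i] are a pair, so every anchor
    has at least one positive.
    """
    total = 0.0
    for anchor, anchor_rel in zip(anchors, relations):
        logits = [sum(a * c for a, c in zip(anchor, cand)) / tau
                  for cand in candidates]
        shift = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - shift) for l in logits]
        positives = sum(e for e, rel in zip(exps, relations) if rel == anchor_rel)
        total += -math.log(positives / sum(exps))
    return total / len(anchors)


def symmetric_anchored_loss(sentences, labels, relations, tau=0.1):
    """Average of the sentence-anchored and label-anchored losses."""
    s = [normalize(v) for v in sentences]
    l = [normalize(v) for v in labels]
    sentence_anchored = anchored_loss(s, l, relations, tau)  # sentence -> labels
    label_anchored = anchored_loss(l, s, relations, tau)     # label -> sentences
    return 0.5 * (sentence_anchored + label_anchored)
```

Under these assumptions, the loss is small when each sentence embedding lies close to the label embedding of its own relation and grows when the pairings are mismatched. Swapping the roles of anchor and candidate is what makes the objective symmetrical.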

Paper Structure

This paper contains 21 sections, 8 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The concepts of single-view CL and multi-view CL. Each color corresponds to a distinct relation, with circles and triangles symbolizing instances and labels, respectively. Solid lines denote proximity between instances (a) or instance-label pairs (b), while dashed lines indicate instances moving apart (a) or mismatched instance-label pairs (b).
  • Figure 2: Overview of the SaCon model. During the pre-training stage, the SaCon framework trains a label encoder and a sentence encoder simultaneously to predict the correct pairings (depicted in green) for a batch of [label, sentence] training examples. In the subsequent fine-tuning stage, the trained label encoder and sentence encoder handle tasks such as few-shot or zero-shot relation extraction by incorporating information from both labels and sentences.
  • Figure 3: Instance-level ($\bullet$) and Label-level ($\blacktriangle$) feature distribution plots of five sampled relations on unit hypersphere with five pre-training frameworks.
  • Figure 4: Mean statistics of alignment and uniformity. Lower values indicate better alignment and uniformity.
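
The alignment and uniformity statistics named in Figure 4 can be sketched as follows. The captions do not give the formulas, so this sketch assumes the widely used definitions of Wang and Isola (2020): alignment is the mean squared distance between positive pairs, and uniformity is the log of the mean Gaussian potential over all distinct pairs, with `t = 2`. Inputs are assumed to be unit-normalized embeddings, and the function names are illustrative.

```python
import math
from itertools import combinations


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(positive_pairs):
    """Mean squared distance between embeddings of positive pairs.

    Lower values mean positives sit closer together.
    """
    return sum(squared_distance(u, v) for u, v in positive_pairs) / len(positive_pairs)


def uniformity(points, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of points.

    Lower values mean the points are spread more evenly over the hypersphere.
    """
    pairs = list(combinations(points, 2))
    potential = sum(math.exp(-t * squared_distance(u, v)) for u, v in pairs)
    return math.log(potential / len(pairs))
```

Under these definitions both quantities decrease as representations improve, which matches the caption's note that lower values are better.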