A Unified Label-Aware Contrastive Learning Framework for Few-Shot Named Entity Recognition
Haojie Zhang, Yimeng Zhuang
TL;DR
The paper addresses few-shot NER by combining label semantics with contrastive learning. It introduces a unified framework that appends natural-language label names as suffix prompts to the input context and jointly optimizes context-context and context-label contrastive objectives, using a projection head to map token representations to Gaussian embeddings and nearest-neighbor inference at test time. Empirical results on OntoNotes, CoNLL'03, WNUT'17, GUM, I2B2, and FEW-NERD show state-of-the-art micro-F1 gains, strong transfer performance, and robust contextual representations. Ablation and visualization analyses attribute the gains to more discriminative context representations and effective use of label semantics, suggesting that the method could extend to other token-level tasks and zero-shot scenarios.
Abstract
Few-shot Named Entity Recognition (NER) aims to extract named entities using only a limited number of labeled examples. Existing contrastive learning methods often produce context vector representations that are insufficiently distinguishable, because they either rely solely on label semantics or disregard them entirely. To tackle this issue, we propose a unified label-aware token-level contrastive learning framework. Our approach enriches the context by using label semantics as suffix prompts. In addition, it simultaneously optimizes context-context and context-label contrastive learning objectives to learn generalized, discriminative contextual representations. Extensive experiments on various traditional test domains (OntoNotes, CoNLL'03, WNUT'17, GUM, I2B2) and the large-scale few-shot NER dataset (FEW-NERD) demonstrate the effectiveness of our approach. It outperforms prior state-of-the-art models by a significant margin, achieving an average absolute gain of 7% in micro F1 scores across most scenarios. Further analysis reveals that our model benefits from its powerful transfer capability and improved contextual representations.
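
The two objectives described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain point vectors with cosine similarity instead of the Gaussian embeddings mentioned in the TL;DR, an InfoNCE-style loss, an unweighted sum of the two terms, and a "[SEP]" separator for the suffix prompt. All function names (build_prompt, contrastive_loss, joint_loss, nearest_neighbor_label) are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    # Assumes non-zero vectors.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def build_prompt(tokens, label_names):
    # Append natural-language label names as a suffix prompt.
    # The "[SEP]" separator is an assumption, not taken from the paper.
    return tokens + ["[SEP]"] + label_names


def contrastive_loss(anchors, anchor_labels, candidates, candidate_labels,
                     temperature=0.1, skip_self=False):
    # InfoNCE-style loss: for each anchor, candidates with the same label
    # are positives and all other candidates are negatives.
    total, count = 0.0, 0
    for i, (a, ya) in enumerate(zip(anchors, anchor_labels)):
        pos, denom = 0.0, 0.0
        for j, (c, yc) in enumerate(zip(candidates, candidate_labels)):
            if skip_self and i == j:
                continue  # an anchor is not its own positive
            score = math.exp(cosine(a, c) / temperature)
            denom += score
            if ya == yc:
                pos += score
        if pos > 0:  # anchors without any positive are skipped
            total += -math.log(pos / denom)
            count += 1
    return total / count if count else 0.0


def joint_loss(token_vecs, token_labels, label_vecs, label_names,
               temperature=0.1):
    # Context-context term: tokens contrasted against other tokens.
    cc = contrastive_loss(token_vecs, token_labels, token_vecs, token_labels,
                          temperature, skip_self=True)
    # Context-label term: tokens contrasted against label representations.
    cl = contrastive_loss(token_vecs, token_labels, label_vecs, label_names,
                          temperature)
    return cc + cl  # equal weighting is an assumption


def nearest_neighbor_label(query, support_vecs, support_labels):
    # Test-time inference: copy the label of the most similar support token.
    sims = [cosine(query, s) for s in support_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return support_labels[best]
```

In this sketch, token vectors whose labels agree with their geometric clusters yield a lower joint loss than the same vectors with mismatched labels, which is the behavior the two contrastive terms are meant to encourage.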
