A Unified Label-Aware Contrastive Learning Framework for Few-Shot Named Entity Recognition

Haojie Zhang; Yimeng Zhuang

TL;DR

The paper addresses few-shot NER by combining label semantics with contrastive learning. It introduces a unified framework that appends natural-language labels to the context as suffix prompts and jointly optimizes context-context and context-label contrastive objectives. A projection head maps token representations to Gaussian embeddings, and inference at test time uses nearest-neighbor prediction. Experiments on OntoNotes, CoNLL'03, WNUT'17, GUM, I2B2, and FEW-NERD show state-of-the-art micro-F1 scores, strong transfer performance, and robust contextual representations. Ablation and visualization analyses attribute the gains to more discriminative context representations and effective use of label semantics, and suggest that the method could extend to other token-level tasks and zero-shot scenarios.

Abstract

Few-shot Named Entity Recognition (NER) aims to extract named entities using only a limited number of labeled examples. Existing contrastive learning methods often suffer from insufficient distinguishability in context vector representation because they either solely rely on label semantics or completely disregard them. To tackle this issue, we propose a unified label-aware token-level contrastive learning framework. Our approach enriches the context by utilizing label semantics as suffix prompts. Additionally, it simultaneously optimizes context-context and context-label contrastive learning objectives to enhance generalized discriminative contextual representations. Extensive experiments on various traditional test domains (OntoNotes, CoNLL'03, WNUT'17, GUM, I2B2) and the large-scale few-shot NER dataset (FEW-NERD) demonstrate the effectiveness of our approach. It outperforms prior state-of-the-art models by a significant margin, achieving an average absolute gain of 7% in micro F1 scores across most scenarios. Further analysis reveals that our model benefits from its powerful transfer capability and improved contextual representations.
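The two objectives named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes plain point embeddings with cosine similarity (the paper projects to Gaussian embeddings, and its exact loss is not given on this page), and the function names, the `temperature` value, and the `weight` parameter are all hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def context_context_loss(tokens, labels, temperature=0.1):
    # Supervised contrastive loss: tokens sharing a label are positives,
    # all other tokens in the batch act as negatives.
    total, counted = 0.0, 0
    for i, anchor in enumerate(tokens):
        others = [j for j in range(len(tokens)) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {j: cosine(anchor, tokens[j]) / temperature for j in others}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / max(counted, 1)

def context_label_loss(tokens, labels, label_vectors, temperature=0.1):
    # Each token is pulled toward the vector of its own label
    # and pushed away from the vectors of all other labels.
    total = 0.0
    for token, label in zip(tokens, labels):
        logits = {name: cosine(token, vec) / temperature
                  for name, vec in label_vectors.items()}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        total += -(logits[label] - log_denom)
    return total / len(tokens)

def unified_loss(tokens, labels, label_vectors, weight=1.0):
    # Hypothetical weighted sum of the two objectives.
    return (context_context_loss(tokens, labels)
            + weight * context_label_loss(tokens, labels, label_vectors))
```

Under these assumptions, embeddings that cluster by entity type and sit close to their label vector yield a lower loss than embeddings whose labels are mixed across clusters, which is the behavior the framework aims for.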

Paper Structure

This paper contains 32 sections, 10 equations, 2 figures, 9 tables, 1 algorithm.

Introduction
Related Work
Meta Learning
Prompt Technology
Few-shot NER
Problem Formulation
Few-Shot NER
Evaluation Protocols
Tagging Scheme
Method
Source Domain Training
Fine-tuning in the Target Domain
Inference Process
Experiments Setups
Datasets
...and 17 more sections

Figures (2)

Figure 1: An overview of the architecture of our proposed model. (a) Training in the source domain and fine-tuning. Fine-tuning follows a similar approach to training, but with a different label prompt. Through contrastive learning, tokens belonging to the same entity type are attracted toward each other, while tokens of different entity types are pushed apart. This encourages the model to learn a more distinct and effective representation of entity-specific information. The contrastive learning has two aspects: context-context and context-label. (b) Inference with nearest-neighbor prediction. Similarity scores between query tokens and support tokens are calculated according to the distance metric.
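The nearest-neighbor prediction in panel (b) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses squared Euclidean distance over point embeddings as a stand-in for the paper's distance metric, which this page does not specify, and the function names are hypothetical.

```python
def squared_euclidean(u, v):
    # Stand-in distance; the paper's actual metric is not given here.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_neighbor_labels(query_vectors, support_vectors, support_labels,
                            distance=squared_euclidean):
    # Assign each query token the label of its closest support token.
    predictions = []
    for query in query_vectors:
        best = min(range(len(support_vectors)),
                   key=lambda i: distance(query, support_vectors[i]))
        predictions.append(support_labels[best])
    return predictions

support_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
support_labels = ["O", "PER", "LOC"]
queries = [[0.9, 1.2], [4.0, 4.5], [0.1, -0.2]]
print(nearest_neighbor_labels(queries, support_vectors, support_labels))
# prints ['PER', 'LOC', 'O']
```

Passing the metric as a parameter keeps the sketch independent of the embedding type, so a divergence between distributions could be swapped in without changing the prediction loop.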
Figure 2: Two-dimensional t-SNE visualizations of the FEW-NERD test set. The token representations are from 6 sampled fine-grained entity types in the location category. Left: CONTaiNER; right: ours.
