Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning
Congying Liu, Gaosheng Wang, Peipei Liu, Xingyuan Wei, Hongsong Zhu
TL;DR
This work tackles the bottlenecks of few-shot NER caused by abundant negative spans and high computational costs in token- or span-level approaches. It introduces MsFNER, a hybrid multi-stage decoding framework that splits NER into entity-span detection (ESD) and entity classification (EC), with meta-learning-based training on a source domain and finetuning on a target domain. EC is enhanced by entity-aware contrastive learning and prototype-based classification, while inference combines EC with KNN augmentation to robustly predict entity types. Experiments on FewNERD and FewAPTER show state-of-the-art performance and favorable efficiency, with MsFNER often outperforming specialized baselines and offering competitive results against ChatGPT on domain-adaptive tasks, highlighting its practical impact for fast, adaptable NER in low-resource scenarios.
Abstract
Few-shot named entity recognition (NER) identifies new types of named entities from only a few labeled examples. Previous methods based on token-level or span-level metric learning suffer from a heavy computational burden and a large number of negative sample spans. In this paper, we propose Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER), which splits general NER into two stages: entity-span detection and entity classification. MsFNER involves three processes: training, finetuning, and inference. In the training process, we separately train the entity-span detection model and the entity classification model on the source domain using meta-learning and keep the best of each; a contrastive learning module enhances the entity representations used for entity classification. During finetuning, we finetune both models on the support dataset of the target domain. In the inference process, we first detect entity spans in the unlabeled data; the type of each detected span is then jointly determined by the entity classification model and KNN. We conduct experiments on the open FewNERD dataset, and the results demonstrate the advantages of MsFNER.
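To make the inference step concrete, the following is a minimal sketch of how a prototype-based classifier and a KNN vote over the support set could jointly decide the type of a detected span. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two predictions are combined, so the linear interpolation with weight `lam`, the squared Euclidean distance, the toy two-dimensional embeddings, and all function names (`build_prototypes`, `prototype_probs`, `knn_probs`, `classify_span`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support):
    """Average the support embeddings of each type into one prototype.

    support: list of (embedding, type_label) pairs.
    """
    by_type = defaultdict(list)
    for emb, label in support:
        by_type[label].append(emb)
    return {
        label: [sum(col) / len(embs) for col in zip(*embs)]
        for label, embs in by_type.items()
    }


def prototype_probs(span_emb, prototypes):
    """Softmax over negative squared distances to each prototype."""
    scores = {t: -sq_dist(span_emb, p) for t, p in prototypes.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


def knn_probs(span_emb, support, k, labels):
    """Fraction of the k nearest support spans that carry each type."""
    nearest = sorted(support, key=lambda pair: sq_dist(span_emb, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {t: votes[t] / len(nearest) for t in labels}


def classify_span(span_emb, support, k=3, lam=0.5):
    """Combine both distributions; the interpolation is an assumption."""
    prototypes = build_prototypes(support)
    p_proto = prototype_probs(span_emb, prototypes)
    p_knn = knn_probs(span_emb, support, k, list(prototypes))
    combined = {t: lam * p_proto[t] + (1 - lam) * p_knn[t] for t in prototypes}
    return max(combined, key=combined.get), combined


# Toy support set: hypothetical span embeddings with their entity types.
support = [
    ([1.0, 0.0], "PER"), ([0.9, 0.1], "PER"),
    ([0.0, 1.0], "LOC"), ([0.1, 0.9], "LOC"),
]
label, scores = classify_span([0.8, 0.2], support)
print(label, scores)
```

In this sketch the span embeddings would come from the finetuned entity classification model, and the spans themselves from the entity-span detection stage. Setting `lam=1.0` reduces the decision to the prototype classifier alone, while `lam=0.0` relies only on the KNN vote.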
