Few-shot Named Entity Recognition via Superposition Concept Discrimination
Jiawei Chen, Hongyu Lin, Xianpei Han, Yaojie Lu, Shanshan Jiang, Bin Dong, Le Sun
TL;DR
Few-shot NER faces an intrinsic precise generalization problem: with only a handful of illustrative instances, the intended target type remains ambiguous. The authors introduce SuperCD, an active-learning framework in which a Concept Extractor (CE) identifies superposition concepts from the illustrative instances and a Superposition Instance Retriever (SIR) retrieves instances of those concepts from large corpora. The retrieved instances are annotated and, together with the original instances, used to train FS-NER models. CE and SIR are trained on large, openly accessible resources (Wikipedia/Wikidata) to learn universal concept extraction and retrieval, and a contrastive loss guides SIR to align queries with the correct instances. Across five datasets and diverse FS-NER architectures, SuperCD yields substantial improvements with a minimal annotation budget, indicating a robust, model-agnostic gain in precise generalization. The approach supplies the missing generalization knowledge at a practical annotation cost, broadening the applicability of few-shot NER in open-domain settings.
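As an illustration of how a contrastive loss can align queries with their correct instances, the following is a minimal sketch of an in-batch InfoNCE-style loss. The exact loss used for SIR is not given in this summary, so the InfoNCE form, the dot-product similarity, the temperature value, and the function names are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(
    queries: List[Sequence[float]],
    instances: List[Sequence[float]],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Mean in-batch contrastive loss.

    instances[i] is treated as the positive for queries[i]; every other
    instance in the batch acts as a negative for that query.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in instances]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the positive instance.
        total += log_z - logits[i]
    return total / len(queries)


# Toy embeddings: each query matches exactly one instance.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
misaligned = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(queries, aligned)
loss_misaligned = info_nce(queries, misaligned)
```

In this toy setup the loss is lower when each query embedding points toward its paired instance than when the pairs are swapped, which is the behavior a retriever trained this way would be pushed toward.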
Abstract
Few-shot NER aims to identify entities of target types with only a limited number of illustrative instances. Unfortunately, few-shot NER is severely challenged by the intrinsic precise generalization problem, i.e., it is hard to accurately determine the desired target type due to the ambiguity stemming from information deficiency. In this paper, we propose Superposition Concept Discriminator (SuperCD), which resolves the above challenge via an active learning paradigm. Specifically, a concept extractor is first introduced to identify superposition concepts from illustrative instances, with each concept corresponding to a possible generalization boundary. Then a superposition instance retriever is applied to retrieve corresponding instances of these superposition concepts from a large-scale text corpus. Finally, annotators are asked to annotate the retrieved instances, and these annotated instances together with the original illustrative instances are used to learn FS-NER models. To this end, we learn a universal concept extractor and superposition instance retriever using large-scale openly available knowledge bases. Experiments show that SuperCD can effectively identify superposition concepts from illustrative instances, retrieve superposition instances from a large-scale corpus, and significantly improve few-shot NER performance with minimal additional effort.
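The three steps described above (extract concepts, retrieve instances, annotate and merge) can be sketched as a single active-learning round. This is a toy illustration under stated assumptions, not the authors' implementation: the learned concept extractor is replaced by a set intersection over hand-written concept tags, the learned retriever by a simple filter, and the human annotator by a callable. All names (`Instance`, `extract_superposition_concepts`, `retrieve_instances`, `supercd_round`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Instance:
    text: str
    concepts: FrozenSet[str]  # toy stand-in for concepts of the mentioned entity


def extract_superposition_concepts(illustrative: List[Instance]) -> Set[str]:
    """Toy concept extractor (assumption): concepts shared by every
    illustrative instance, each a candidate generalization boundary."""
    shared = set(illustrative[0].concepts)
    for inst in illustrative[1:]:
        shared &= inst.concepts
    return shared


def retrieve_instances(concept: str, corpus: List[Instance], k: int) -> List[Instance]:
    """Toy retriever (assumption): first k corpus instances carrying the concept."""
    return [inst for inst in corpus if concept in inst.concepts][:k]


def supercd_round(
    illustrative: List[Instance],
    corpus: List[Instance],
    annotate: Callable[[Instance], bool],
    k: int = 1,
) -> List[Tuple[Instance, bool]]:
    """One round: extract concepts, retrieve instances, annotate, merge."""
    labeled = [(inst, True) for inst in illustrative]
    seen = set(illustrative)
    for concept in sorted(extract_superposition_concepts(illustrative)):
        for inst in retrieve_instances(concept, corpus, k):
            if inst not in seen:  # never ask the annotator twice
                seen.add(inst)
                labeled.append((inst, annotate(inst)))
    return labeled


# Two examples that could mean "person", "politician", or "american".
illustrative = [
    Instance("Barack Obama gave a speech.", frozenset({"person", "politician", "american"})),
    Instance("Joe Biden visited Ohio.", frozenset({"person", "politician", "american"})),
]
corpus = [
    Instance("Angela Merkel met reporters.", frozenset({"person", "politician", "german"})),
    Instance("Taylor Swift released an album.", frozenset({"person", "singer", "american"})),
    Instance("Paris is crowded in summer.", frozenset({"city"})),
]


def annotate(inst: Instance) -> bool:
    """Simulated annotator whose intended target type is 'politician'."""
    return "politician" in inst.concepts


labeled = supercd_round(illustrative, corpus, annotate)
```

In this toy run the annotator is asked about only two retrieved instances: the non-politician American is marked negative and the non-American politician is marked positive, which is enough to separate "politician" from the competing readings "american" and "person". The merged `labeled` list would then be passed to whichever FS-NER model is being trained.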
