Few-shot Named Entity Recognition via Superposition Concept Discrimination

Jiawei Chen, Hongyu Lin, Xianpei Han, Yaojie Lu, Shanshan Jiang, Bin Dong, Le Sun

TL;DR

Few-shot NER suffers from intrinsic precise generalization challenges when only a handful of illustrative instances are available. The authors introduce SuperCD, an active-learning framework that identifies superposition concepts with a Concept Extractor (CE) and retrieves high-value examples via a Superposition Instance Retriever (SIR) from large corpora; annotated instances, together with the originals, train FS-NER models. CE/SIR are trained on large, accessible resources (Wikipedia/Wikidata) to learn universal concept extraction and retrieval, and a contrastive loss guides SIR to align queries with correct instances. Across five datasets and diverse FS-NER architectures, SuperCD yields substantial improvements with minimal annotation budgets, demonstrating robust, model-agnostic enhancement of precise generalization in FS-NER. This approach enables targeted generalization knowledge infusion with practical annotation costs, broadening the applicability of few-shot NER in open-domain settings.

Abstract

Few-shot NER aims to identify entities of target types given only a limited number of illustrative instances. Unfortunately, few-shot NER is severely challenged by the intrinsic precise generalization problem, i.e., it is hard to accurately determine the desired target type due to the ambiguity stemming from information deficiency. In this paper, we propose Superposition Concept Discriminator (SuperCD), which resolves the above challenge via an active learning paradigm. Specifically, a concept extractor is first introduced to identify superposition concepts from illustrative instances, with each concept corresponding to a possible generalization boundary. Then a superposition instance retriever is applied to retrieve corresponding instances of these superposition concepts from a large-scale text corpus. Finally, annotators are asked to annotate the retrieved instances, and these annotated instances, together with the original illustrative instances, are used to learn FS-NER models. To this end, we learn a universal concept extractor and superposition instance retriever using large-scale, openly available knowledge bases. Experiments show that SuperCD can effectively identify superposition concepts from illustrative instances, retrieve superposition instances from a large-scale corpus, and significantly improve few-shot NER performance with minimal additional effort.
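
The three steps described in the abstract (extract concepts, retrieve instances, annotate and merge) can be sketched as a single loop. This is only an illustrative outline, not the authors' implementation: the function name `supercd_round`, the three callables, the toy corpus, and the example sentences are all hypothetical stand-ins for the concept extractor, the superposition instance retriever, and a human annotator.

```python
from typing import Callable, List, Tuple

# (sentence, mentions annotated as the target type); an empty list means
# the annotator judged that the sentence contains no target-type entity.
Instance = Tuple[str, List[str]]


def supercd_round(
    illustrative: List[Instance],
    extract_concepts: Callable[[List[Instance]], List[str]],
    retrieve: Callable[[str], List[str]],
    annotate: Callable[[str], List[str]],
) -> List[Instance]:
    """One round: extract concepts -> retrieve sentences -> annotate -> merge."""
    augmented = list(illustrative)
    seen = {sentence for sentence, _ in illustrative}
    for concept in extract_concepts(illustrative):
        for sentence in retrieve(concept):
            if sentence in seen:
                continue  # skip sentences that are already in the training set
            seen.add(sentence)
            augmented.append((sentence, annotate(sentence)))
    return augmented


# Toy stand-ins (assumptions for illustration only).
toy_corpus = {
    "high school": ["She graduated from Inderkum High School."],
    "sports organization": ["The WTA published its new rankings."],
}
illustrative = [("He studied at Example University.", ["Example University"])]

training_set = supercd_round(
    illustrative,
    extract_concepts=lambda insts: ["high school", "sports organization"],
    retrieve=lambda concept: toy_corpus.get(concept, []),
    annotate=lambda s: ["Inderkum High School"] if "Inderkum" in s else [],
)
```

In this sketch, a retrieved sentence with an empty annotation is kept on purpose: it records that the corresponding concept lies outside the target type, which is the kind of boundary information the few illustrative instances alone cannot provide.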

Paper Structure

This paper contains 24 sections, 4 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Examples of the precise generalization challenge. Underlines indicate the annotated entity mentions. Given only the illustrative instances, the desired target type may be University, Educational institution, Research institution or Organization. Discriminating superposition concepts such as High school, Academy of sciences and Sports organization helps determine the desired target entity type.
  • Figure 2: Overview of SuperCD. Underlines indicate the annotated entity mentions. SuperCD first extracts sets of superposition concepts and then retrieves the corresponding instances. Finally, by annotating that Inderkum High School is of the target type while NAE (Research institution) and WTA (Sports organization) are not, and using these instances to train the model, the generalization knowledge is injected into the FS-NER model.
  • Figure 3: The process of superposition concept extraction. Sets of superposition concepts are constructed from the concepts of the few-shot illustrative instances in an "A but not B" manner.
  • Figure 4: An example of query generation in dataset construction.
  • Figure 5: The micro-F1 scores of BERT on entities with unseen concepts in the test set.
  • ...and 2 more figures
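
One plausible reading of the "A but not B" construction in the Figure 3 caption is a set difference over the concepts attached to each illustrative mention. The sketch below follows that reading and should be treated as an assumption, since the caption does not spell out the exact procedure. The function name `a_but_not_b`, the mention keys, and the concept sets are hypothetical.

```python
from itertools import permutations
from typing import Dict, Set, Tuple


def a_but_not_b(concepts: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """For every ordered pair (a, b), collect the concepts that a has and b lacks."""
    out: Dict[Tuple[str, str], Set[str]] = {}
    for a, b in permutations(concepts, 2):
        only_a = concepts[a] - concepts[b]
        if only_a:  # keep only pairs that actually differ
            out[(a, b)] = only_a
    return out


# Hypothetical concept sets for two illustrative mentions.
concepts_by_mention = {
    "mention_1": {"organization", "educational institution", "university"},
    "mention_2": {"organization", "educational institution", "research institution"},
}
concept_sets = a_but_not_b(concepts_by_mention)
```

Under this reading, concepts shared by every mention (here "organization" and "educational institution") never appear in any set, while the concepts on which the mentions disagree ("university", "research institution") are exactly the ones left for retrieval and annotation to resolve.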