Template-Based Named Entity Recognition Using BART
Leyang Cui, Yu Wu, Jian Liu, Sen Yang, Yue Zhang
TL;DR
The paper tackles few-shot NER in which the target domain's label set differs from that of the source domain. It reframes NER as a language-model ranking task: candidate spans are filled into statement templates, and a sequence-to-sequence model scores the results. The template-based BART framework trains on positive (X,T^+) and negative (X,T^−) template pairs and, at inference, classifies each candidate span by the scores of its filled templates. The method reaches 92.55% F1 on CoNLL03 with template ensembles, and in the low-resource setting it outperforms fine-tuned BERT by 10.88%, 15.34%, and 11.73% F1 on MIT Movie, MIT Restaurant, and ATIS, respectively. The approach transfers knowledge across domains, is robust to differences in domain style, and accommodates new entity types without changing the output layer. The code is publicly released for reproducibility.
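As a rough illustration of what positive and negative template pairs could look like, the sketch below enumerates candidate spans of a sentence and fills each into a template. The template wording, the helper name `build_pairs`, and the choice to keep every non-entity span as a negative are assumptions made for this example; they are not taken from the summary above.

```python
from typing import Dict, List, Tuple

# Hypothetical template wording, chosen for illustration only.
POSITIVE_TEMPLATE = "{span} is a {label} entity"
NEGATIVE_TEMPLATE = "{span} is not a named entity"

Span = Tuple[int, int]  # token offsets, end exclusive


def build_pairs(
    tokens: List[str], gold: Dict[Span, str], max_len: int = 3
) -> List[Tuple[str, str]]:
    """Return (source sentence X, target template T) pairs for one sentence.

    Gold spans yield positive targets (T^+); every other candidate span
    yields a negative target (T^-). A real setup would likely subsample
    the negatives; this sketch keeps all of them (assumption).
    """
    sentence = " ".join(tokens)
    pairs = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            text = " ".join(tokens[start:end])
            label = gold.get((start, end))
            if label is not None:
                target = POSITIVE_TEMPLATE.format(span=text, label=label)
            else:
                target = NEGATIVE_TEMPLATE.format(span=text)
            pairs.append((sentence, target))
    return pairs
```

Each pair can then serve as a source/target example for sequence-to-sequence fine-tuning. Because the entity type appears only as text in the target, adding a new type means adding a new label word rather than resizing an output layer.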
Abstract
There is recent interest in few-shot NER, where a low-resource target domain has a different label set from a resource-rich source domain. Existing methods use a similarity-based metric, but they cannot make full use of knowledge transfer in NER model parameters. To address this issue, we propose a template-based method for NER that treats NER as a language model ranking problem in a sequence-to-sequence framework: the original sentence is the source sequence, and a statement template filled with a candidate named entity span is the target sequence. At inference, the model classifies each candidate span based on the scores of its corresponding templates. Our experiments show that the proposed method achieves a 92.55% F1 score on CoNLL03 (rich-resource task) and significantly outperforms fine-tuned BERT, by 10.88%, 15.34%, and 11.73% F1 on MIT Movie, MIT Restaurant, and ATIS (low-resource tasks), respectively.
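The inference step described above can be sketched as a ranking loop over candidate spans. This is a minimal sketch under stated assumptions, not the authors' implementation: the template wording is hypothetical, `classify_spans` is a name introduced here, and `score` is a placeholder for whatever template score the sequence-to-sequence model assigns to a target given the source sentence. Overlapping predictions are not resolved.

```python
from typing import Callable, List, Tuple

# score(source_sentence, target_template) -> higher means more plausible.
# In the described method this role is played by the seq2seq model's
# template score; here it is an injected placeholder (assumption).
Scorer = Callable[[str, str], float]


def classify_spans(
    tokens: List[str], labels: List[str], score: Scorer, max_len: int = 3
) -> List[Tuple[int, int, str]]:
    """Label every candidate span with its best-scoring template.

    Returns (start, end, label) triples, end exclusive, for spans whose
    best entity template outscores the non-entity template.
    """
    sentence = " ".join(tokens)
    results = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            text = " ".join(tokens[start:end])
            # Baseline: the hypothetical "not an entity" template.
            best_label = None
            best_score = score(sentence, f"{text} is not a named entity")
            for label in labels:
                s = score(sentence, f"{text} is a {label} entity")
                if s > best_score:
                    best_label, best_score = label, s
            if best_label is not None:
                results.append((start, end, best_label))
    return results
```

Passing the scorer in as a function keeps the ranking logic independent of any particular model, so the loop can be exercised with a toy scorer before a fine-tuned model is plugged in.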
