Cross-domain Named Entity Recognition via Graph Matching
Junhao Zheng, Haibin Chen, Qianli Ma
TL;DR
This work tackles cross-domain NER under data scarcity by introducing Label Structure Transfer for cross-domain NER (LST-NER), which builds label graphs for both source and target label spaces from model predictions and aligns them with a Gromov-Wasserstein distance-based graph matching. It enhances contextual representations by fusing label-graph semantics into BERT embeddings through a label-guided attention mechanism and GCN, coupled with an auxiliary multi-label task. The method yields robust improvements over transfer-learning, multi-task, and few-shot baselines across eight domains in both rich- and low-resource regimes, and benefits further when combined with domain-adaptive pre-training. These results demonstrate that explicit modeling and transfer of label structure can effectively bridge domain gaps and may generalize to other cross-domain prediction tasks.
Abstract
Cross-domain NER is a practical yet challenging problem because of data scarcity in real-world scenarios. A common practice is to first learn a NER model in a rich-resource general domain and then adapt the model to specific domains. Because entity types are mismatched across domains, the broad knowledge learned in the general domain cannot be transferred effectively to the target-domain NER model. To this end, we model the label relationship as a probability distribution and construct label graphs in both the source and target label spaces. To enhance the contextual representation with label structures, we fuse the label graph into the word embeddings output by BERT. By representing label relationships as graphs, we formulate cross-domain NER as a graph matching problem. Furthermore, the proposed method is compatible with pre-training methods and is potentially applicable to other cross-domain prediction tasks. Empirical results on four datasets show that our method outperforms a series of transfer learning, multi-task learning, and few-shot learning methods.
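To make the graph matching formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each label node is the mean predicted probability distribution over the tokens carrying that label and that edge weights are squared Euclidean distances between node distributions; the abstract does not specify either choice. The helper names `label_graph` and `gw_objective` are hypothetical. The sketch only evaluates the Gromov-Wasserstein objective with a square loss at a given coupling; it does not search for the optimal coupling, which a dedicated solver would do.

```python
from itertools import product


def label_graph(probs, labels, num_labels):
    """Build a label graph from per-token predicted distributions.

    Assumption: node k is the mean distribution of tokens labelled k, and
    edge (k, l) is the squared Euclidean distance between nodes k and l.
    """
    dim = len(probs[0])
    sums = [[0.0] * dim for _ in range(num_labels)]
    counts = [0] * num_labels
    for dist, y in zip(probs, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += dist[d]
    # Labels with no tokens keep an all-zero node (max(c, 1) avoids division by zero).
    nodes = [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]
    edges = [[sum((a - b) ** 2 for a, b in zip(u, v)) for v in nodes] for u in nodes]
    return nodes, edges


def gw_objective(c_src, c_tgt, coupling):
    """Gromov-Wasserstein objective (square loss) at a fixed coupling.

    Sums (c_src[i][k] - c_tgt[j][l])^2 * coupling[i][j] * coupling[k][l]
    over all index tuples; no minimisation over the coupling is performed.
    """
    n, m = len(c_src), len(c_tgt)
    total = 0.0
    for i, j, k, l in product(range(n), range(m), range(n), range(m)):
        total += (c_src[i][k] - c_tgt[j][l]) ** 2 * coupling[i][j] * coupling[k][l]
    return total


# Toy data (made up for illustration): three tokens, two labels.
toy_probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]]
toy_labels = [0, 0, 1]
toy_nodes, toy_edges = label_graph(toy_probs, toy_labels, num_labels=2)

identity_coupling = [[0.5, 0.0], [0.0, 0.5]]    # each label matched to itself
uniform_coupling = [[0.25, 0.25], [0.25, 0.25]]  # no structure preserved

cost_identity = gw_objective(toy_edges, toy_edges, identity_coupling)
cost_uniform = gw_objective(toy_edges, toy_edges, uniform_coupling)
```

Matching a graph to itself with the identity coupling preserves every pairwise distance, so its cost is lower than that of the uniform coupling. A lower cost indicates a coupling that better preserves the label structure across the two graphs, which is the intuition behind using this discrepancy to align the source and target label spaces.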
