mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning

Ying Mo; Jian Yang; Jiahao Liu; Qifan Wang; Ruoyu Chen; Jingang Wang; Zhoujun Li

TL;DR

Cross-lingual NER suffers from data scarcity and weak cross-language alignment, especially for low-resource languages. The paper introduces mCL-NER, a multi-view contrastive framework that pairs semantic alignment with token-to-token relation contrastive learning and augments training with code-switched data and self-training on unlabeled target data. By reformulating NER as token-pair relation classification and enforcing agreement in semantic and relational spaces, the approach achieves state-of-the-art results on XTREME-40 and CoNLL. This method improves cross-language entity transfer, particularly for distant languages, and demonstrates robustness via ablations and analyses of token-level representations.
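As a rough illustration of what "token-pair relation classification" can look like, the sketch below turns entity spans into an n x n grid of pair labels and decodes them back. The label scheme here ("NNW" linking adjacent tokens inside an entity, "THW-<type>" linking an entity's last token back to its first) is an assumption chosen for illustration; this summary does not state the relation labels the paper actually uses, and the function names are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, type), token indices inclusive


def spans_to_pair_grid(n_tokens: int, spans: List[Span]) -> List[List[str]]:
    """Build an n x n grid of token-pair relation labels from entity spans.

    Assumed (illustrative) scheme:
      grid[i][i + 1] = "NNW"        for adjacent tokens inside one entity
      grid[end][start] = "THW-<T>"  from the entity's tail token to its head
      "NONE" for every other pair
    """
    grid = [["NONE"] * n_tokens for _ in range(n_tokens)]
    for start, end, ent_type in spans:
        if not (0 <= start <= end < n_tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        for i in range(start, end):
            grid[i][i + 1] = "NNW"
        grid[end][start] = f"THW-{ent_type}"
    return grid


def pair_grid_to_spans(grid: List[List[str]]) -> List[Span]:
    """Recover contiguous entity spans from the grid built above."""
    spans = []
    for tail in range(len(grid)):
        for head in range(tail + 1):
            label = grid[tail][head]
            if not label.startswith("THW-"):
                continue
            # Accept the span only if every adjacent pair inside it is linked.
            if all(grid[i][i + 1] == "NNW" for i in range(head, tail)):
                spans.append((head, tail, label[len("THW-"):]))
    return sorted(spans)


tokens = ["Barack", "Obama", "visited", "New", "York", "City"]
gold = [(0, 1, "PER"), (3, 5, "LOC")]
grid = spans_to_pair_grid(len(tokens), gold)
decoded = pair_grid_to_spans(grid)
```

Under this assumed scheme, a classifier would predict one label per token pair instead of one tag per token, and entities are read off the predicted grid with the decoding step.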

Abstract

Cross-lingual named entity recognition (CrossNER) suffers from uneven performance across languages because multilingual corpora are scarce, especially for non-English data. Prior efforts focus mainly on data-driven transfer methods, while aligning both semantic and token-level representations across languages has not been fully explored. In this paper, we propose Multi-view Contrastive Learning for Cross-lingual Named Entity Recognition (mCL-NER). Specifically, we reframe CrossNER as the problem of recognizing relationships between pairs of tokens. This formulation exploits the contextual cues carried by token-to-token connections within entities, allowing us to align representations across languages. We introduce a multi-view contrastive learning framework that covers semantic contrasts among source, code-switched, and target sentences, as well as contrasts among token-to-token relations. By enforcing agreement in both the semantic and relational spaces, we narrow the gap between source sentences and their code-switched and target counterparts. This alignment extends to the relationships between tokens, improving the projection of entities across languages. We further strengthen CrossNER with self-training that combines labeled source data and unlabeled target data. Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of mCL-NER over prior data-driven and model-based approaches. It improves $F_1$ by nearly 2.0 points across a broad spectrum of languages and sets a new state of the art.
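To make "enforcing agreement in the semantic space" concrete, here is a minimal sketch of an InfoNCE-style contrastive loss over sentence embeddings. This is an assumption for illustration only: the abstract does not give the paper's loss, and the function names, the temperature value, and the toy embeddings are hypothetical. Each source sentence embedding is pulled toward the embedding of its own code-switched (or target) counterpart and pushed away from the other counterparts in the batch.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors: Sequence[Sequence[float]],
             positives: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive view of anchors[i]; every other entry of
    `positives` serves as an in-batch negative for anchors[i].
    """
    if not anchors or len(anchors) != len(positives):
        raise ValueError("anchors and positives must be non-empty and aligned")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): two source sentences and their counterparts.
source = [[1.0, 0.0], [0.0, 1.0]]
code_switched = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce(source, code_switched)
misaligned_loss = info_nce(source, list(reversed(code_switched)))
```

The loss is small when each source sentence is closest to its own counterpart and grows when the pairing is scrambled. The same form could in principle be applied to token-pair relation representations, though that is again an assumption rather than something stated here.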

Paper Structure (27 sections, 10 equations, 7 figures, 3 tables)

Introduction
Related work
Cross-lingual NER
Contrastive Learning
Problem Formulation
Cross-Lingual Multi-view Contrastive Learning
Semantic Contrastive Learning
Token-to-Token Relation Contrastive Learning
Self-Training
Experiments
Datasets
XTREME-40
CoNLL
Implementation Details and Evaluation
Main Results
...and 12 more sections

Figures (7)

Figure 1: Illustration of mCL-NER vs. existing methods TSL (Multi_source_2020) and CROP (CROP_Yang_2022). $D_{src}$, $D_{trans}$ and $D_{cds}$ are the source, translated-source and code-switched data respectively. $M_{*}$ represents the models trained on the corresponding data. Our model leverages multi-view contrastive learning to bridge the gap between cross-lingual semantic and token-to-token representations.
Figure 2: Overview of the proposed mCL-NER. It transforms the sequence labeling problem into classification of token-pair relations, trained with semantic contrastive learning and token-to-token relation contrastive learning.
Figure 3: Ablation study of mCL-NER. XTREME-Avg denotes the average F1 score over 39 languages in XTREME.
Figure 4: Token-to-token relation representations without contrastive learning (first panel) and with contrastive learning (second panel).
Figure 5: t-SNE visualization of different pre-defined categories (e.g. LOC) in Chinese. The first and second panels show the token-to-token relation representations between entities with and without contrastive learning, respectively.
...and 2 more figures
