Cross-Lingual Transfer for Natural Language Inference via Multilingual Prompt Translator

Xiaoyu Qiu, Yuechen Wang, Jiaxin Shi, Wengang Zhou, Houqiang Li

TL;DR

We propose Multilingual Prompt Translator (MPT), a framework in which a multilingual prompt translator processes the knowledge embedded in a soft prompt, changing its language knowledge while retaining its task knowledge.

Abstract

Cross-lingual transfer with prompt learning, built on multilingual pre-trained models, has shown promising effectiveness: a soft prompt learned in a source language is transferred to target languages for downstream tasks, particularly in low-resource scenarios. To transfer the soft prompt efficiently, we propose a novel framework, Multilingual Prompt Translator (MPT), in which a multilingual prompt translator processes the knowledge embedded in the prompt, changing its language knowledge while retaining its task knowledge. Concretely, we first train a prompt in the source language and then use the translator to translate it into a target prompt. In addition, we use an external parallel corpus as auxiliary data, on which we design an alignment task over the predicted answer probabilities to convert language knowledge, thereby equipping the target prompt with multilingual knowledge. In few-shot settings on XNLI, MPT outperforms the baselines with remarkable improvements. The advantage of MPT over vanilla prompting is more pronounced when transferring to languages that differ substantially from the source language.

Paper Structure (12 sections, 9 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: The source-language training data (yellow) and the extended parallel data (blue) are each concatenated with the corresponding prompt for joint training. A multilingual prompt translator translates the source-language prompt ($\boldsymbol{p}^S$) into a multilingual target prompt ($\boldsymbol{p}^T$). At inference, predictions are made on target-language test data (yellow) with the translated $\boldsymbol{p}^T$.
  • Figure 2: An overview of the proposed MPT, in which a prompt translator translates the source-language prompt into a target-language prompt. In addition to the original training data (yellow), we incorporate auxiliary data (blue) from an extended dataset. MPT is optimized by minimizing a combination of the CE loss for the classification task and the KLD loss for the cross-lingual alignment task.
  • Figure 3: Accuracy gain of MPT relative to SP among different languages (%).
  • Figure 4: Ablations on the architecture of the multilingual prompt translator and the position of the soft prompt, measured by MPT accuracy (%).
  • Figure 5: Ablations on hyper-parameters, measured by MPT accuracy (%). From left to right: the impact of prompt length, loss weight $\alpha$, and external data size.
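
The captions for Figures 2 and 5 mention a CE loss, a KLD loss, and a loss weight $\alpha$, but this page does not give the exact objective. The following is a minimal sketch of one plausible reading, assuming the total loss is `CE + alpha * KLD` and that the KLD term compares the predicted answer distributions for the two sides of a parallel sentence pair. The function names, the direction of the KL divergence, and the default `alpha` are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Classification loss for one labeled example."""
    return -math.log(probs[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def combined_loss(cls_logits, label, src_logits, tgt_logits, alpha=0.5):
    """Assumed form: CE on a labeled example plus alpha-weighted KLD
    between answer distributions of a parallel (source, target) pair."""
    ce = cross_entropy(softmax(cls_logits), label)
    kld = kl_divergence(softmax(src_logits), softmax(tgt_logits))
    return ce + alpha * kld
```

Under this reading, the alignment term vanishes when the source-side and target-side predictions agree, so only the classification loss remains; a larger `alpha` penalizes disagreement on the parallel data more strongly.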