Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-Wei Chang, Kristina Toutanova

TL;DR

The paper interrogates the assumption that English is the best source language for zero-shot cross-lingual transfer, evaluating mBERT and mT5 on tasks including XNLI, PAWS-X, XQuAD, and TyDi QA. By introducing formal transferability metrics, it shows that languages like German and Russian often provide superior cross-lingual transfer, sometimes even when training data is machine-translated from English. A surprising finding is that translating English training data into a better source language before fine-tuning can improve performance, challenging the notion that native English data is always optimal. These insights offer immediate guidance for multilingual system design and benchmark construction, suggesting a shift away from English as the default transfer language and highlighting the impact of pre-training regimes and translation quality on transfer performance.

Abstract

Despite their success, large pre-trained multilingual models have not completely alleviated the need for labeled data, which is cumbersome to collect for all target languages. Zero-shot cross-lingual transfer is emerging as a practical solution: pre-trained models later fine-tuned on one transfer language exhibit surprising performance when tested on many target languages. English is the dominant source language for transfer, as reinforced by popular zero-shot benchmarks. However, this default choice has not been systematically vetted. In our study, we compare English against other transfer languages for fine-tuning, on two pre-trained multilingual models (mBERT and mT5) and multiple classification and question answering tasks. We find that other high-resource languages such as German and Russian often transfer more effectively, especially when the set of target languages is diverse or unknown a priori. Unexpectedly, this can be true even when the training sets were automatically translated from English. This finding can have immediate impact on multilingual zero-shot systems, and should inform future benchmark designs.

Paper Structure

This paper contains 23 sections, 4 equations, 1 figure, 8 tables.

Figures (1)

  • Figure 1: Is English the best language for zero-shot cross-lingual transfer? In current literature, English is the dominant transfer language for fine-tuning (step 2). In this study, we investigate whether this is the most effective choice on standard multilingual benchmarks.