XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples

Peiqin Lin; André F. T. Martins; Hinrich Schütze

TL;DR

XAMPLER addresses the challenge of cross-lingual few-shot learning for low-resource languages by training a retriever of English examples that works across languages, using only English data. It constructs labeled English candidate pairs via MaLA500 predictions, fine-tunes a Glot500-based retriever with a contrastive loss, and then retrieves English few-shot examples for in-context learning with MaLA500 on queries in any language. Evaluations on SIB200 and MasakhaNEWS show consistent improvements over strong baselines, with efficient training and inference, demonstrating the viability of English-only data for broad cross-lingual ICL. The method offers scalable, data-efficient cross-lingual retrieval across hundreds of languages, enabling practical few-shot learning in diverse linguistic settings.

Abstract

Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of English. However, adapting these methods to other languages, especially low-resource ones, poses challenges due to the scarcity of cross-lingual retrievers and annotated data. Thus, we introduce XAMPLER: Cross-Lingual Example Retrieval, a method tailored to tackle the challenge of cross-lingual in-context learning using only annotated English data. XAMPLER first trains a retriever based on Glot500, a multilingual small language model, using positive and negative English examples constructed from the predictions of a multilingual large language model, i.e., MaLA500. Leveraging the cross-lingual capacity of the retriever, it can directly retrieve English examples as few-shot examples for in-context learning of target languages. Experiments on two multilingual text classification benchmarks, namely SIB200 with 176 languages and MasakhaNEWS with 16 languages, demonstrate that XAMPLER substantially improves the in-context learning performance across languages. Our code is available at https://github.com/cisnlp/XAMPLER.
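The data construction and retriever training steps described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the names `split_candidates`, `predict`, and `contrastive_loss` are hypothetical, the criterion for a positive example (the LLM predicts the query's gold label when shown that candidate) is an assumption, and the loss is a standard InfoNCE-style contrastive loss over dot-product similarities, which may differ in detail from the one used in the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (text, label); hypothetical representation


def split_candidates(
    query: str,
    gold_label: str,
    candidates: List[Example],
    predict: Callable[[str, Example], str],
) -> Tuple[List[Example], List[Example]]:
    """Split candidates into positives and negatives.

    Assumption: `predict(query, candidate)` stands in for the LLM's
    prediction on `query` when `candidate` is given as an in-context example.
    A candidate is positive if that prediction equals the gold label.
    """
    positives, negatives = [], []
    for cand in candidates:
        if predict(query, cand) == gold_label:
            positives.append(cand)
        else:
            negatives.append(cand)
    return positives, negatives


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(
    q_vec: Sequence[float],
    pos_vec: Sequence[float],
    neg_vecs: List[Sequence[float]],
    temperature: float = 1.0,
) -> float:
    """InfoNCE-style loss: negative log-probability of the positive
    under a softmax over the positive and all negative similarities."""
    logits = [dot(q_vec, pos_vec) / temperature]
    logits += [dot(q_vec, n) / temperature for n in neg_vecs]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

In an actual setup the vectors would come from the retriever encoder and the loss would be minimized by gradient descent; here the function only shows which quantity is being optimized.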

Paper Structure (17 sections, 5 figures, 5 tables)

This paper contains 17 sections, 5 figures, 5 tables.

Introduction
Approach
Problem Definition
Data Construction
Retriever Fine-tuning
In-Context Learning
Experiment
Setup
Benchmark
Baselines
Main Results
Ablation Study
Related Work
Conclusion
KNN Performance Across Layers
...and 2 more sections

Figures (5)

Figure 1: XAMPLER involves three steps: 1. Data Construction: given a query in English $q_i$, we divide the candidate English examples $D_i$ into positive examples $D_{i}^{pos}$ and negative examples $D_{i}^{neg}$ based on the prediction of MaLA500 (Lin et al., 2024); 2. Retriever Fine-tuning: we fine-tune the retriever based on Glot500 (ImaniGooghari et al., 2023) using the constructed data; 3. In-Context Learning: given a query in any language $q_j$, we use the fine-tuned retriever to retrieve relevant English examples $D_j$ as few-shot examples for in-context learning. For training, XAMPLER requires English data only. Once trained, the model can be applied to any of the 500 languages covered by MaLA500/Glot500 without any need for (often unavailable) labeled low-resource data.
Figure 2: KNN (K-Nearest Neighbors) vs. ICL (In-Context Learning) with different numbers of shots. X-axis: number of shots. Y-axis: macro-average accuracy.
Figure 3: Results of 10-shot KNN (K-Nearest Neighbors) with Glot500 as retriever across layers.
Figure 4: Results of 10-shot KNN (K-Nearest Neighbors) with MaLA500 as retriever across layers.
Figure 5: In-context learning with XAMPLER for different values of $k$.
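
The figures contrast two ways of using retrieved English examples: KNN, which predicts directly from the neighbors' labels, and ICL, which places the neighbors in the prompt. The sketch below illustrates the difference under stated assumptions: embeddings are plain lists of floats, similarity is cosine, KNN is a majority vote, and the prompt template in `build_prompt` is hypothetical rather than the one used in the paper. All function names are illustrative.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

PoolItem = Tuple[str, str, Sequence[float]]  # (text, label, embedding)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def retrieve(query_vec: Sequence[float], pool: List[PoolItem], k: int) -> List[PoolItem]:
    """Return the k pool examples most similar to the query, best first."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex[2]), reverse=True)
    return ranked[:k]


def knn_predict(neighbors: List[PoolItem]) -> str:
    """KNN baseline: majority vote over the retrieved examples' labels."""
    return Counter(label for _, label, _ in neighbors).most_common(1)[0][0]


def build_prompt(neighbors: List[PoolItem], query_text: str) -> str:
    """ICL: format retrieved English examples as few-shot demonstrations,
    followed by the (possibly non-English) query. Template is hypothetical."""
    shots = "".join(f"Text: {text}\nLabel: {label}\n\n" for text, label, _ in neighbors)
    return f"{shots}Text: {query_text}\nLabel:"
```

Because the query and the English pool are embedded by the same multilingual encoder, the query passed to `retrieve` can be in any covered language while the pool stays English.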
