
Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation

Snegha A, Sayambhu Sen, Piyush Singh Pasi, Abhishek Singhania, Preethi Jyothi

TL;DR

This work systematically evaluates prefix-based parameter-efficient tuning (soft prompt tuning, prefix tuning, and Llama Adapter) for zero-shot cross-lingual transfer in decoder-only LLMs, comparing it against LoRA and full fine-tuning across 1B–24B models on XNLI, XQUAD, Belebele, and MGSM. Prefix-based methods achieve consistent cross-lingual gains, often outperforming LoRA and approaching or surpassing full fine-tuning on multilingual benchmarks while using as few as 1.23M trainable parameters. The study also analyzes transfer patterns across language families and scripts, model-size scaling behavior, and hyperparameter sensitivity, and finds that adapting a subset of layers with appropriately sized prefixes yields strong, scalable cross-lingual performance, especially in low-resource settings. Overall, prefix-based adaptation emerges as a robust, efficient alternative for multilingual deployment of decoder-only LLMs, particularly in resource-constrained cross-lingual applications.

Abstract

With the release of new large language models (LLMs) like Llama and Mistral, zero-shot cross-lingual transfer has become increasingly feasible due to their multilingual pretraining and strong generalization capabilities. However, adapting these decoder-only LLMs to new tasks across languages remains challenging. While parameter-efficient fine-tuning (PeFT) techniques like Low-Rank Adaptation (LoRA) are widely used, prefix-based techniques such as soft prompt tuning, prefix tuning, and Llama Adapter are less explored, especially for zero-shot transfer in decoder-only models. We present a comprehensive study of three prefix-based methods for zero-shot cross-lingual transfer from English to 35+ high- and low-resource languages. Our analysis further explores transfer across linguistic families and scripts, as well as the impact of scaling model sizes from 1B to 24B. With Llama 3.1 8B, prefix methods outperform LoRA baselines by up to 6% on the Belebele benchmark. Similar improvements were observed with Mistral v0.3 7B. Despite using only 1.23M trainable parameters with prefix tuning, we achieve consistent improvements across diverse benchmarks. These findings highlight the potential of prefix-based techniques as an effective and scalable alternative to LoRA, particularly in low-resource multilingual settings.
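
For a sense of scale, the trainable-parameter budget of prefix-style methods can be estimated with simple arithmetic. The sketch below is illustrative only: the hidden size of 4096 is the published Llama 3.1 8B model dimension, while the prompt length, prefix length, and number of adapted layers are hypothetical values chosen for the example, and the function names are ours. The abstract does not state the configuration behind its 1.23M figure, so this is not a reconstruction of it.

```python
def soft_prompt_params(prompt_len: int, hidden_size: int) -> int:
    """Soft prompt tuning: one learned embedding vector per virtual token,
    prepended at the input layer only."""
    return prompt_len * hidden_size


def per_layer_prefix_params(prefix_len: int, hidden_size: int,
                            num_adapted_layers: int) -> int:
    """Per-layer prefixes: one learned vector per prefix position in each
    adapted layer (simplified; ignores any gating or reparameterization)."""
    return prefix_len * hidden_size * num_adapted_layers


HIDDEN_SIZE = 4096  # Llama 3.1 8B model dimension

# Hypothetical settings, for illustration only.
print(soft_prompt_params(prompt_len=20, hidden_size=HIDDEN_SIZE))
print(per_layer_prefix_params(prefix_len=10, hidden_size=HIDDEN_SIZE,
                              num_adapted_layers=30))
```

Under these assumed settings, a 20-token soft prompt has 81,920 trainable parameters, and a 10-position prefix over 30 layers has 1,228,800, a tiny fraction of the 8B parameters in the frozen base model.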

Paper Structure

This paper contains 26 sections, 12 equations, 3 figures, and 19 tables.

Figures (3)

  • Figure 1: Schematic representation of: (A) LoRA fine-tuning and prefix-based methods, (B) Llama Adapter, (C) Prefix tuning, and (D) Soft prompt tuning.
  • Figure 2: Comparison of prefix-based methods across model sizes against LoRA fine-tuning on XQUAD (F1 score).
  • Figure 3: Varying temperature (left) and top-p (right) values using Llama 3.2 (1B) on the XQUAD task.