Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation

Tong Su, Xin Peng, Sarubi Thillainathan, David Guzmán, Surangika Ranathunga, En-Shiun Annie Lee

TL;DR

The paper addresses the challenge of adapting large pre-trained MT models to low-resource languages with limited data and compute. It conducts a comprehensive empirical study of 8 PEFT methods, covering 15 architectures in total, for low-resource NMT, using SacreBLEU to evaluate in-domain and out-of-domain performance with mBART-50 as the base model. The results show that bottleneck adapters, especially Houlsby and Houlsby+Inversion, yield strong translation quality improvements while maintaining efficiency, with Pfeiffer offering the fastest training times; these findings generalize across multiple language pairs and domains. The work provides actionable guidelines for selecting PEFT methods in low-resource translation and points to future directions, including more language-specific adapters and additional evaluation metrics to better capture human judgments.

Abstract

Parameter-efficient fine-tuning (PEFT) methods are increasingly vital in adapting large-scale pre-trained language models for diverse tasks, offering a balance between adaptability and computational efficiency. They are important in Low-Resource Language (LRL) Neural Machine Translation (NMT) to enhance translation accuracy with minimal resources. However, their practical effectiveness varies significantly across different languages. We conducted comprehensive empirical experiments with varying LRL domains and sizes to evaluate the performance of 8 PEFT methods, with a total of 15 architectures, using the SacreBLEU score. We showed that 6 PEFT architectures outperform the baseline in both in-domain and out-of-domain tests and that the Houlsby+Inversion adapter has the best performance overall, demonstrating the effectiveness of PEFT methods.

Paper Structure (10 sections, 3 figures, 8 tables)

Figures (3)

  • Figure 1: Full list of 8 PEFT methods and 15 architectures. Each color box represents a specific structure appearing in the PEFT methods. The same color represents the PEFT methods that share a similar structure.
  • Figure 2: Average $\Delta\%$ compared to baseline for each dataset tested on in-domain and out-of-domain.
  • Figure 3: Performance of LRL Translation Pairs by Fine-Tuning Dataset Size (In-Domain only).
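
The caption of Figure 2 reports an average $\Delta\%$ against the baseline but does not define it here. A minimal sketch, assuming $\Delta\%$ is the relative change in SacreBLEU score with respect to the baseline, averaged over test sets, might look as follows. The function names and the scores in the usage example are hypothetical and are not taken from the paper.

```python
from statistics import mean


def delta_percent(score: float, baseline: float) -> float:
    """Relative change of `score` against `baseline`, in percent.

    Assumed definition: (score - baseline) / baseline * 100.
    """
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (score - baseline) / baseline * 100.0


def average_delta_percent(scores, baselines) -> float:
    """Mean of the per-test-set relative changes (assumed aggregation)."""
    if len(scores) != len(baselines):
        raise ValueError("scores and baselines must have the same length")
    return mean(delta_percent(s, b) for s, b in zip(scores, baselines))


# Hypothetical SacreBLEU scores for one PEFT architecture on two test sets.
peft_scores = [22.0, 33.0]
baseline_scores = [20.0, 30.0]
print(f"{average_delta_percent(peft_scores, baseline_scores):.1f}%")
```

Averaging the per-test-set percentages, rather than taking the percentage change of the averaged scores, gives each test set equal weight regardless of its absolute score range; whether the paper aggregates this way is an assumption.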