Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification

Olesya Razuvayevskaya, Ben Wu, Joao A. Leite, Freddy Heppell, Ivan Srba, Carolina Scarton, Kalina Bontcheva, Xingyi Song

TL;DR

This paper systematically compares parameter-efficient fine-tuning (PEFT) approaches, namely LoRA and bottleneck adapters, against full fine-tuning (FFT) for multilingual, multilabel news article classification across three SemEval-2023 Task 3 sub-tasks (genre, framing, and persuasion techniques). It explores three training scenarios (multilingual joint, translate-train, and English-only) using XLM-RoBERTa Large; BitFit is excluded after ablation, and Pfeiffer adapters are generally preferred over Houlsby adapters. The study reports substantial compute savings for PEFT methods (roughly 140–280× fewer trainable parameters and 32–44% shorter training time), with effects on classification performance that vary across tasks and languages. LoRA performs especially well on persuasion techniques under multilingual joint training, with gains that bring it close to leaderboard-level results, while FFT and adapters often perform better on longer texts. These findings inform when and how to deploy PEFT in multilingual, non-parallel, long-text classification settings, including zero-shot cross-lingual transfer and translation-based training.

Abstract

Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results have demonstrated that these methods can even improve performance on some classification tasks. This paper complements the existing research by investigating how these techniques influence classification performance and computational cost, compared to full fine-tuning, when applied to multilingual text classification tasks (genre, framing, and persuasion technique detection, which differ in input length, number of predicted classes, and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of their efficacy across different training scenarios (training on the original multilingual data, on translations into English, and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of parameter-efficient fine-tuning techniques, particularly to complex multilingual and multilabel classification tasks.
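
To give a rough sense of where LoRA's parameter savings come from, the sketch below counts the trainable parameters of a low-rank update for a single weight matrix and compares them with the dense matrix. The hidden size of 1024 matches XLM-RoBERTa Large, but the rank of 8 and both helper functions are hypothetical choices made for illustration, not settings reported in the paper. The per-matrix ratio it prints is also not the paper's whole-model figure, which depends on which layers are adapted and on the classification head.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the full weight matrix (bias ignored)."""
    return d_in * d_out


# Illustrative values only: hidden size 1024 matches XLM-RoBERTa Large;
# rank 8 is an assumed value, not the paper's reported configuration.
hidden, rank = 1024, 8
full = dense_params(hidden, hidden)
lora = lora_trainable_params(hidden, hidden, rank)
reduction = full / lora

print(f"dense: {full}, LoRA: {lora}, reduction: {reduction:.0f}x")
```

Under these assumptions, full fine-tuning updates every entry of the matrix, while LoRA trains only the two thin factors, so the saving for a square matrix grows as the rank shrinks relative to the hidden size.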

Paper Structure (12 sections, 4 equations, 16 tables)