No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement
Mateusz Klimaszewski, Piotr Andruszkiewicz, Alexandra Birch
TL;DR
The paper tackles negative interference and limited positive transfer in multilingual pre-trained language models by introducing Language Arithmetic (LA), a training-free post-processing technique that extends task arithmetic to the language adapters of the MAD-X cross-lingual framework. By forming language vectors and combining them additively, LA blends knowledge from related languages to improve zero-shot and low-resource transfer without additional training. Experiments on NER, NLI, and QA across 13 languages with XLM-R and mBERT show gains in the zero-shot setting and consistent improvements when existing adapters are enhanced, with the largest benefit in challenging low-resource scenarios. The work also shows that new languages can be prototyped rapidly and analyses lambda selection and language relatedness, suggesting that training-free arithmetic may apply more broadly in multilingual NLP.
Abstract
Modular deep learning is the state-of-the-art solution for lifting the curse of multilinguality: it limits negative interference and enables cross-lingual performance in Multilingual Pre-trained Language Models. However, the trade-off of this approach is reduced positive transfer from closely related languages. In response, we introduce a novel method called language arithmetic, which addresses this limitation through training-free post-processing. Extending the task arithmetic framework, we apply learning via addition to language adapters, moving the framework from a multi-task to a multilingual setup. We demonstrate the effectiveness of the proposed solution on three downstream tasks in a MAD-X-based set of cross-lingual schemes, where it acts as a post-processing procedure. Language arithmetic consistently improves the baselines with significant gains, especially in the most challenging case of zero-shot application. Our code and models are available at https://github.com/mklimasz/language-arithmetic.
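The idea of learning via addition on language adapters can be illustrated with a minimal sketch. It assumes that both adapters were trained from the same initialisation, that a language vector is the difference between the trained adapter weights and that initialisation, and that the two vectors are mixed with weights lambda and (1 - lambda). The function names and the dictionary-of-lists weight format are hypothetical and are not taken from the released code.

```python
from typing import Dict, List

# Hypothetical weight format: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def language_vector(adapter: Weights, init: Weights) -> Weights:
    """Assumed definition: trained adapter weights minus the shared initialisation."""
    return {
        name: [a - i for a, i in zip(adapter[name], init[name])]
        for name in adapter
    }


def language_arithmetic(
    init: Weights, adapter_a: Weights, adapter_b: Weights, lam: float
) -> Weights:
    """Combine two language adapters without any training.

    Assumed form: init + lam * tau_a + (1 - lam) * tau_b,
    where tau_a and tau_b are the language vectors of the two adapters.
    """
    tau_a = language_vector(adapter_a, init)
    tau_b = language_vector(adapter_b, init)
    return {
        name: [
            i + lam * ta + (1.0 - lam) * tb
            for i, ta, tb in zip(init[name], tau_a[name], tau_b[name])
        ]
        for name in init
    }


# Toy example with a single two-value parameter.
init = {"w": [0.0, 1.0]}
adapter_a = {"w": [1.0, 3.0]}
adapter_b = {"w": [3.0, 1.0]}
combined = language_arithmetic(init, adapter_a, adapter_b, lam=0.5)
```

Under these assumptions, setting lambda to 1 recovers the first adapter unchanged and setting it to 0 recovers the second, so lambda controls how much each language contributes to the combined adapter.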
