AlignFreeze: Navigating the Impact of Realignment on the Layers of Multilingual Models Across Diverse Languages
Steve Bakos, Félix Gaschi, David Guzmán, Riddhi More, Kelly Chutong Li, En-Shiun Annie Lee
TL;DR
AlignFreeze addresses the inconsistent benefits of realignment for cross-lingual transfer in multilingual language models. By freezing either the lower or the upper half of a model's layers during realignment, the method shows that realignment affects all layers but is especially harmful to the lower ones, which AlignFreeze can protect in the case of PoS tagging. Across 4 tasks, 3 models, and 35 languages, front-freezing (freezing the lower half) improves PoS tagging in languages where full realignment fails, and it is associated with better generalization than full realignment in several settings. The work highlights that cross-lingual transfer remains hard to predict, and that partial freezing offers a practical, language-aware way to mitigate forgetting while improving transfer on syntactic and morphological tasks.
Abstract
Realignment techniques are often employed to enhance cross-lingual transfer in multilingual language models, but they can sometimes degrade performance in languages that differ significantly from the source language used for fine-tuning. This paper introduces AlignFreeze, a method that freezes either the lower half or the upper half of the layers during realignment. Through controlled experiments on 4 tasks, 3 models, and 35 languages, we find that realignment affects all layers but can be most detrimental to the lower ones, and that freezing the lower layers can prevent this performance degradation. In particular, AlignFreeze improves Part-of-Speech (PoS) tagging performance in languages where full realignment fails: with XLM-R, it improves accuracy by more than one standard deviation in seven more languages than full realignment does.
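
The freezing step described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the "front"/"back" strategy labels as strings, the Hugging Face-style parameter naming ("encoder.layer.<i>."), the rounding for odd layer counts, and the choice to leave embeddings untouched are all assumptions made for illustration.

```python
import re
from typing import Iterable, List

# Assumed naming pattern (hypothetical for this sketch): parameter names such as
# "encoder.layer.3.attention.self.query.weight" carry the layer index.
_LAYER_RE = re.compile(r"\blayer\.(\d+)\.")


def frozen_layer_indices(num_layers: int, strategy: str) -> range:
    """Return the indices of the layers kept frozen during realignment.

    "front" freezes the lower half, "back" freezes the upper half.
    Assumption: with an odd layer count, the upper half gets the extra layer.
    """
    half = num_layers // 2
    if strategy == "front":
        return range(0, half)
    if strategy == "back":
        return range(half, num_layers)
    raise ValueError(f"unknown strategy: {strategy!r}")


def params_to_freeze(
    param_names: Iterable[str], num_layers: int, strategy: str
) -> List[str]:
    """Select the parameter names that belong to a frozen layer.

    Parameters without a layer index (e.g. embeddings) are never selected;
    this is a choice of the sketch, not something stated in the abstract.
    """
    frozen = frozen_layer_indices(num_layers, strategy)
    selected = []
    for name in param_names:
        match = _LAYER_RE.search(name)
        if match and int(match.group(1)) in frozen:
            selected.append(name)
    return selected
```

With a PyTorch model, one would typically iterate over model.named_parameters() and set requires_grad to False on the selected parameters before running the realignment objective, then unfreeze everything again for task fine-tuning. Whether the original work proceeds exactly this way is not stated here.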
