Unveiling Linguistic Regions in Large Language Models

Zhihao Zhang; Jun Zhao; Qi Zhang; Tao Gui; Xuanjing Huang

TL;DR

This work probes the internal mechanisms underlying cross-lingual alignment in large language models by partitioning parameters into a core linguistic region and language-specific monolingual regions. It localizes a core region comprising roughly $1\%$ of parameters whose perturbation severely degrades multilingual competence, and demonstrates distinct monolingual regions that selectively affect individual languages. Through structured and dimensional perturbations, the authors reveal strong dimensional dependence within the core region and interdependencies among its dimensions. They further show that freezing the core region during additional pre-training can mitigate catastrophic forgetting, offering a practical approach to balance language learning and retention without adding new components. Overall, the findings provide a mechanistic lens on LLM linguistic capabilities and point to targeted strategies for efficient multilingual fine-tuning and continual learning.

Abstract

Large Language Models (LLMs) have demonstrated considerable cross-lingual alignment and generalization ability. Current research primarily focuses on improving LLMs' cross-lingual generalization capabilities. However, the intrinsic mechanisms by which LLMs achieve cross-lingual alignment remain under-explored. From the perspective of region partitioning, this paper conducts several investigations into the linguistic competence of LLMs. We discover a core region in LLMs that corresponds to linguistic competence, accounting for approximately 1% of the total model parameters. Removing this core region by setting its parameters to zero results in a significant performance decrease across 30 different languages. Furthermore, this core region exhibits significant dimensional dependence: perturbing even a single parameter on specific dimensions can lead to a loss of linguistic competence. Moreover, we discover that distinct monolingual regions exist for different languages, and disrupting these regions substantially reduces the LLMs' proficiency in the corresponding languages. Our research also indicates that freezing the core linguistic region during further pre-training can mitigate catastrophic forgetting (CF), a common phenomenon observed during further pre-training of LLMs. Overall, exploring the LLMs' functional regions provides insights into the foundation of their intelligence.
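The two operations the abstract describes, zeroing a small high-importance region and freezing that region during further training, can be illustrated with a toy sketch. This is not the authors' implementation: the importance scores here are random placeholders (the paper derives its own in the Parameter Importance section, which is not reproduced on this page), and the helper names `top_fraction_mask`, `remove_region`, and `frozen_sgd_step` are hypothetical.

```python
import random

def top_fraction_mask(importance, fraction=0.01):
    """Return a boolean mask marking the `fraction` of entries with the highest importance."""
    k = max(1, int(len(importance) * fraction))
    # Indices sorted by descending importance; the sort is stable, so ties keep index order.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    mask = [False] * len(importance)
    for i in ranked[:k]:
        mask[i] = True
    return mask

def remove_region(params, mask):
    """'Remove' a region by setting every masked parameter to zero."""
    return [0.0 if m else p for p, m in zip(params, mask)]

def frozen_sgd_step(params, grads, mask, lr=0.1):
    """One plain SGD step that leaves masked (frozen) parameters unchanged."""
    return [p if m else p - lr * g for p, g, m in zip(params, grads, mask)]

# Toy data: a flat vector of 1000 "parameters" with placeholder importance scores (assumption).
rng = random.Random(0)
params = [rng.gauss(0.0, 1.0) for _ in range(1000)]
importance = [rng.random() for _ in range(1000)]
grads = [rng.gauss(0.0, 1.0) for _ in range(1000)]

core = top_fraction_mask(importance, fraction=0.01)  # the top 1% as a stand-in "core region"
ablated = remove_region(params, core)                # ablation: zero the region
updated = frozen_sgd_step(params, grads, core)       # further training with the region frozen
```

In a real model the same mask would be applied per weight matrix rather than to one flat list, and freezing would typically be done by masking gradients before the optimizer step.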

Paper Structure (42 sections, 5 equations, 10 figures, 17 tables)

Introduction
Background and Metric
Model Pre-training
Parameter Importance
Experiments
Experimental Setup
Core Linguistic Competence Region
Logical Evidence
Empirical Evidence 1
Empirical Evidence 2
Dimensional Dependence
Structured Removal
Single Dimension Perturbation
Single Parameter Perturbation
Ablation Study
...and 27 more sections

Figures (10)

Figure 1: Three main findings of our experiments: (1) Identification of a core linguistic region within LLMs, whose removal leads to a loss of linguistic competence; (2) Discovery of monolingual regions, whose removal causes a significant loss of proficiency in specific languages; (3) Freezing the core region during further pre-training slows language forgetting.
Figure 2: Visualization of the linguistic competence region (the 'Top' 5% region). The scale from 0 to 1 (after normalization) represents the proportion of parameters within a $3\times3$ vicinity that belong to the 'Top' region.
Figure 3: Perplexity of LLaMA-2-13B when perturbing a single dimension (Att.O and FFN.down columns) across all layers. 'Top k' denotes the k dimensions whose perturbation disrupts the model the most. 'Random selected' refers to a randomly chosen dimension. 'Original' indicates that no dimensions are perturbed.
Figure 4: Comparison of linguistic competence. Scaling a single parameter to four times its original value breaks linguistic competence in LLaMA-2-13B, a 13-billion-parameter LLM.
Figure 5: Model accuracy on the MMLU and ArabicMMLU (filtered) test sets. Here, 'filtered' denotes removing questions whose correct answer is 'A'.
...and 5 more figures
