Knowledge Localization in Mixture-of-Experts LLMs Using Cross-Lingual Inconsistency

Lucas Bandarkar; Alan Ansell; Trevor Cohn

Abstract

Modern LLMs continue to exhibit significant variance in behavior across languages, such as recalling factual information in some languages but not in others. While this is typically studied as a problem to be mitigated, in this work we propose leveraging this cross-lingual inconsistency as a tool for interpretability in mixture-of-experts (MoE) LLMs. Our knowledge localization framework contrasts routing on languages where the model correctly recalls a piece of information with routing on languages where it fails. This allows us to isolate model components that play a functional role in answering questions about that piece of knowledge. Our method proceeds in two stages: (1) querying the model with difficult factual questions across a diverse set of languages to generate "success" and "failure" activation buckets, and (2) applying a statistical contrastive analysis to the MoE router logits to identify the experts important for that knowledge. To validate that this small number of experts is necessary for answering a knowledge question, we deactivate them and re-ask the question. We find that despite deactivating only about 20 out of 6000 experts, the model no longer answers correctly in over 40% of cases. More broadly, this method provides a realistic and scalable knowledge localization approach for increasingly complex LLMs.
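The contrastive step in stage (2) can be illustrated with a minimal sketch. The outline lists a Mann-Whitney U (MWU) test, so the sketch below scores each expert by the normalized U statistic between its "success" and "failure" router logits and keeps the top few. Everything else is an assumption made for illustration and not taken from the paper: the function names, the dict layout keyed by a hypothetical `(layer, expert_index)` pair, the toy numbers, and the choice to rank by effect size rather than by p-value.

```python
from itertools import product


def mwu_effect_size(success, failure):
    """Mann-Whitney U for `success` vs `failure`, scaled to [0, 1].

    Returns U / (n1 * n2): the fraction of (success, failure) pairs in which
    the success value is larger, counting ties as one half.
    """
    if not success or not failure:
        raise ValueError("both buckets must be non-empty")
    u = 0.0
    for s, f in product(success, failure):
        if s > f:
            u += 1.0
        elif s == f:
            u += 0.5
    return u / (len(success) * len(failure))


def rank_experts(success_logits, failure_logits, max_experts):
    """Rank experts by how much higher their router logits are on success runs.

    Both arguments map a hypothetical expert id (layer, expert_index) to a
    list of router logits, one per language in that bucket (assumed layout).
    """
    scores = {
        expert: mwu_effect_size(success_logits[expert], failure_logits[expert])
        for expert in success_logits
        if expert in failure_logits
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:max_experts]


# Toy data: expert (5, 17) receives higher logits when the model answers correctly,
# while expert (5, 3) looks the same in both buckets.
success = {(5, 17): [2.1, 1.8, 2.4], (5, 3): [0.2, 0.1, 0.3]}
failure = {(5, 17): [0.4, 0.9, 0.5], (5, 3): [0.3, 0.1, 0.2]}
top = rank_experts(success, failure, max_experts=1)
print(top)
```

A score near 1.0 means the expert's logits are almost always higher in the success bucket, and a score near 0.5 means no separation. In a real setting the candidates selected this way would then be deactivated and the question re-asked, as the abstract describes.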

Paper Structure (45 sections, 3 equations, 4 figures, 3 tables)

Introduction
Related Work
Knowledge Localization and Editing
Interpretability in Mixture-of-Experts
Language-Agnostic Representations
Setup
Cross-Lingual Inconsistency
Expert Identification Methodology
MoE Preliminaries and Notation
Routing Data Collection
Avoiding Generalist Experts
A. Excluding Top and Bottom Layers
B. Blacklisting Top Experts
C. Subtracting Language-Specific Average
Statistical MWU Test
...and 30 more sections

Figures (4)

Figure 1: Overview of our knowledge localization framework for MoEs that uses Cross-Lingual Inconsistency for Contrastive Identification (XICI).
Figure 2: Impact of different "max experts" values for our method, when applied to GLM-4.5-Air on MultiLoKo questions.
Figure 3: Location of experts identified for GLM-4.5-Air.
Figure 4: Location of experts identified for Qwen3-30B-A3B.
