
Editing Across Languages: A Survey of Multilingual Knowledge Editing

Nadir Durrani, Basel Mousi, Fahim Dalvi

TL;DR

This work addresses the challenge of keeping multilingual LLMs factually current by defining Multilingual Knowledge Editing (MKE) and surveying four major method families: parameter-editing, memory-based, fine-tuning, and hypernetwork-based approaches. It analyzes cross-lingual propagation, presents a taxonomy of methods, and reviews multilingual benchmarks and evaluation criteria, highlighting that no approach yet achieves perfect locality, propagation, and efficiency across all languages. Key findings indicate that X-KDE offers the best overall balance, while language relatedness, model size, and instruction tuning significantly influence cross-lingual transfer and robustness. The paper outlines open challenges, such as language anisotropy and benchmark fragmentation, and identifies opportunities in robust multilingual benchmarks, language-aware editing, and cross-model transfer to advance editable, language-aware LLMs.

Abstract

While Knowledge Editing has been extensively studied in monolingual settings, it remains underexplored in multilingual contexts. This survey systematizes recent research on Multilingual Knowledge Editing (MKE), a growing subdomain of model editing focused on ensuring factual edits generalize reliably across languages. We present a comprehensive taxonomy of MKE methods, covering parameter-based, memory-based, fine-tuning, and hypernetwork approaches. We survey available benchmarks, summarize key findings on method effectiveness and transfer patterns, identify challenges in cross-lingual propagation, and highlight open problems related to language anisotropy, evaluation coverage, and edit scalability. Our analysis consolidates a rapidly evolving area and lays the groundwork for future progress in editable, language-aware LLMs.

Paper Structure

This paper contains 53 sections, 4 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Comparison of multilingual knowledge editing methods across four evaluation criteria: reliability, generality, locality, and portability. Methods are grouped and color-coded by method family. Scores represent an approximate synthesis of reported results from recent studies [zhang-etal-2025-multilingualXKDE].
  • Figure 2: Average Reliability and Generality for MEMLA [xie2024memlaenhancingmultilingualknowledge] across twelve languages, illustrating a typical directionality pattern: edits in high-resource or related languages transfer better than those in low-resource or distant ones.
  • Figure 3: Generality scores for BLOOMZ and LLaMA models across increasing model sizes. ReMaKE [wang-etal-2024-retrieval] (BLOOMZ) results are extracted for mzsRE generality, averaged across 10 languages. IKE [nie2025bmike53investigatingcrosslingualknowledge] (LLaMA) results are extracted for mzsRE generality under zero-shot settings, averaged across 53 languages. Results show that generality consistently improves with model scale for both model families.