Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance

Somnath Banerjee; Avik Halder; Rajarshi Mandal; Sayan Layek; Ian Soboroff; Rima Hazra; Animesh Mukherjee

TL;DR

The paper investigates how knowledge edits in multilingual LLMs propagate across languages and finds persistent cross-lingual gaps that hinder linguistic equity. It evaluates the ROME and MEMIT editing methods on eight languages using ELFI/ELFO stress tests and two datasets (CounterFact, ZsRE), with translations and a multilingual merging approach for Indic languages. Findings show that while edits can be reliable within a language, cross-lingual transfer is inconsistent, and model merging offers limited, nonuniform gains. The work highlights the need for inclusive multilingual training, systematic testing, and practical strategies (such as expanded data, continual editing, alignment-focused architectures, and dedicated edit modules) to realize linguistically fair AI systems.

Abstract

The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our research identifies significant discrepancies in normal and merged models concerning cross-lingual consistency. We employ strategies like 'each language for itself' (ELFI) and 'each language for others' (ELFO) to stress-test these models. Our findings demonstrate the potential for LLMs to overcome linguistic barriers, laying the groundwork for future research in achieving linguistic inclusivity in AI technologies.
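
The two stress-test strategies named above can be read as two ways of pairing an editing language with an inference language. The sketch below is only an illustration of that reading, not the authors' implementation: the language codes, the function names `elfi_pairs` and `elfo_pairs`, and the choice to pair every editing language with every other language under ELFO are all assumptions made here for clarity.

```python
from itertools import product

# Assumed ISO-style codes for the eight languages listed in the abstract.
LANGUAGES = ["en", "de", "fr", "it", "es", "hi", "ta", "kn"]


def elfi_pairs(languages):
    """'Each language for itself': edit and run inference in the same language."""
    return [(lang, lang) for lang in languages]


def elfo_pairs(languages):
    """'Each language for others': edit in one language, infer in a different one.

    Assumption: every ordered pair of distinct languages is included.
    The paper may restrict this set (for example, to English edits only).
    """
    return [(edit, infer) for edit, infer in product(languages, repeat=2) if edit != infer]


if __name__ == "__main__":
    print(len(elfi_pairs(LANGUAGES)), "ELFI pairs")
    print(len(elfo_pairs(LANGUAGES)), "ELFO pairs")
```

Under these assumptions, each (edit language, inference language) pair would correspond to one evaluation run of an edited model, which is what makes cross-lingual inconsistencies visible.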

Paper Structure (28 sections, 3 equations, 3 figures, 10 tables)

This paper contains 28 sections, 3 equations, 3 figures, 10 tables.

Introduction
Related work
Task overview
Dataset
Experimental setup
Selection of LLMs
Editing methods
Evaluation metric
Results
Self edit - self inference perspective
English edit - self inference perspective
Merged model perspective
Error analysis
Discussion
Key observations
...and 13 more sections

Figures (3)

Figure 1: Edited knowledge conflict across various languages for TowerInstruct.
Figure 2: Each metric on the $x$-axis is represented by two bars: the left bar indicates an exact match, while the right bar indicates a partial match. For each bar, the divisions along the $y$-axis reflect the average values of the metric, aggregated across Romance and Germanic languages evaluated. These subdivisions are color-coded to denote the editing language, as specified in the legend.
Figure 3: Each metric on the $x$-axis is represented by two bars: the left bar indicates an exact match, while the right bar indicates a partial match. For each bar, the divisions along the $y$-axis reflect the average values of the metric, aggregated across all Indic languages evaluated. These subdivisions are color-coded to denote the editing language, as specified in the legend.
