Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance
Somnath Banerjee, Avik Halder, Rajarshi Mandal, Sayan Layek, Ian Soboroff, Rima Hazra, Animesh Mukherjee
TL;DR
The paper investigates how knowledge edits in multilingual LLMs propagate across languages, revealing persistent cross-lingual gaps that hinder linguistic equity. It evaluates the ROME and MEMIT editing methods on eight languages using ELFI/ELFO stress tests and two datasets (CounterFact and ZsRE), with translations and a multilingual merging approach for Indic languages. Findings show that while edits can be reliable within a language, cross-lingual transfer is inconsistent, and model merging offers limited, nonuniform gains. The work highlights the need for inclusive multilingual training, systematic testing, and practical strategies (such as expanded data, continual editing, alignment-focused architectures, and dedicated edit modules) to realize linguistically fair AI systems.
Abstract
The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper addresses the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across eight languages: English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our research identifies significant discrepancies in cross-lingual consistency in both normal and merged models. We employ two strategies, 'each language for itself' (ELFI) and 'each language for others' (ELFO), to stress-test these models. Our findings demonstrate the potential for LLMs to overcome linguistic barriers, laying the groundwork for future research in achieving linguistic inclusivity in AI technologies.
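As a rough illustration of the two stress-test settings, the sketch below enumerates (edit language, test language) pairs. It assumes that ELFI means an edit is applied and evaluated in the same language, and that ELFO means an edit applied in one language is evaluated in a different one. The abstract does not state which pairs were actually evaluated, so the all-pairs ELFO set here is an assumption. The function names and the ISO 639-1 language codes are hypothetical.

```python
from itertools import product

# The eight languages named in the abstract, as ISO 639-1 codes
# (the codes are an assumption for illustration, not taken from the paper).
LANGUAGES = ["en", "de", "fr", "it", "es", "hi", "ta", "kn"]


def elfi_pairs(languages):
    """ELFI, as assumed here: edit and test in the same language."""
    return [(lang, lang) for lang in languages]


def elfo_pairs(languages):
    """ELFO, as assumed here: edit in one language, test in a different one."""
    return [
        (edit_lang, test_lang)
        for edit_lang, test_lang in product(languages, repeat=2)
        if edit_lang != test_lang
    ]


if __name__ == "__main__":
    print(len(elfi_pairs(LANGUAGES)), "ELFI pairs")
    print(len(elfo_pairs(LANGUAGES)), "ELFO pairs")
```

Under these assumptions, each pair would be passed to an editing method such as ROME or MEMIT and then scored on the test-language prompts. That step is omitted here because it depends on the specific model and editing library.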
