MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing
Nianwen Si, Hao Zhang, Weiqiang Zhang
TL;DR
This work tackles the problem of keeping multilingual LLMs up to date by addressing cross-lingual editing, where an update made in one language does not reliably propagate to others. It introduces the Multilingual Patch Neuron (MPN), a simple input-side augmentation to Transformer-Patcher-based editing that trains patch neurons on English editing data together with parallel corpora, enabling cross-lingual updates with minimal changes to existing methods. Empirical results on XFEVER and XNLI show that MPN improves cross-lingual generalization (CLG) by about 9–12% on XFEVER and around 14% on XNLI, while preserving locality and maintaining good monolingual generalization. The method exploits the multilingual model's inherent cross-lingual representations by embedding cross-lingual sampling into the editing process, offering a practical, easily adaptable path to multilingual model editing.
Abstract
Large language models encode a vast amount of factual knowledge, but this knowledge often becomes outdated as external information changes. Model editing methods offer a promising way to update it efficiently. However, most existing model editing techniques are limited to monolingual settings and therefore fail to address the crucial issue of cross-lingual knowledge synchronization for multilingual models. To tackle this problem, we propose a simple yet effective method that trains multilingual patch neurons to store cross-lingual knowledge. The method can be easily adapted to existing approaches to enhance their cross-lingual editing capabilities. To evaluate it, we conduct experiments on both the XNLI dataset and a self-constructed XFEVER dataset. Experimental results demonstrate that the proposed method improves performance on cross-lingual editing tasks without requiring extensive modifications to the original methodology, which makes it easy to adopt. Code will be released soon.
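
As a rough illustration of the cross-lingual sampling idea summarized above, the following Python sketch draws the input for each patch-training step either from the English edit example or from one of its parallel translations, while keeping the target label fixed. It is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the dictionary layout of an edit, and the sampling probability `p_cross` are all hypothetical.

```python
import random


def sample_edit_input(english_text, translations, p_cross, rng):
    """Pick the input text for one patch-training step.

    With probability p_cross (an assumed hyperparameter), return a parallel
    translation chosen uniformly at random; otherwise return the English text.
    """
    if translations and rng.random() < p_cross:
        lang = rng.choice(sorted(translations))
        return lang, translations[lang]
    return "en", english_text


def build_patch_batch(edit, steps, p_cross=0.5, seed=0):
    """Build training inputs for one edit (hypothetical data layout).

    `edit` is assumed to look like:
        {"en": <English input>, "parallel": {lang: text}, "label": <target>}
    The label is shared across languages because the edited knowledge is
    the same regardless of the language it is expressed in.
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(steps):
        lang, text = sample_edit_input(edit["en"], edit["parallel"], p_cross, rng)
        batch.append({"lang": lang, "input": text, "label": edit["label"]})
    return batch
```

In this sketch, setting `p_cross=0.0` reduces to training on English inputs only, so the augmentation touches only the input side and leaves the rest of an editing pipeline unchanged.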
