Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge

Daniel Tamayo; Aitor Gonzalez-Agirre; Javier Hernando; Marta Villegas

TL;DR

The paper addresses reliable editing of factual knowledge in transformer-based language models, with a focus on cross-lingual behavior. It extends MEMIT with MEMAT, which uses an ITI-inspired framework to identify a set of attention heads and optimize them in a second language, refining the edited factual associations. Across English and Catalan, MEMAT significantly improves on MEMIT on multiple metrics while changing few parameters, and it is portable to unseen languages, aided by a cross-lingual head-selection strategy. The work highlights the roles of both subject tokenization and language-independent attention signals in cross-lingual knowledge editing, and lays groundwork for more language-robust, explainable editing methods.

Abstract

Recent research has explored methods for updating and modifying factual knowledge in large language models, often focusing on specific multi-layer perceptron blocks. This study expands on this work by examining the effectiveness of existing knowledge editing methods across languages and delving into the role of attention mechanisms in this process. Drawing from the insights gained, we propose Mass-Editing Memory with Attention in Transformers (MEMAT), a method that achieves significant improvements in all metrics while requiring minimal parameter modifications. MEMAT delivers a remarkable 10% increase in magnitude metrics, benefits languages not included in the training data and also demonstrates a high degree of portability. Our code and data are at https://github.com/dtamayo-nlp/MEMAT.
