Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers
Zihan Qiu, Zeyu Huang, Youcheng Huang, Jie Fu
TL;DR
This work treats transformer FFNs as key-value memories, formalized as $\mathrm{FFN}(h) = f(h K^\top) V$, and empirically compares updating keys with updating values across knowledge-editing and fine-tuning tasks. Using back-propagation updates (and LoRA variants) on GPT-J (6B), GPT-2 XL, and Llama2-7B under 4-bit quantization, the study finds that updating keys generally yields better generalization, locality, and efficiency than updating values. The results suggest that altering the mechanism that controls knowledge retrieval (the keys) is often more effective than directly editing the stored content (the values), with practical implications for efficient, robust model editing.
Abstract
The feed-forward networks (FFNs) in transformers are recognized as a group of key-value neural memories that store abstract, high-level knowledge. In this work, we conduct an empirical ablation study on updating the keys (the first layer of the FFN) versus the values (the second layer of the FFN). We compare the two methods on various knowledge-editing and fine-tuning tasks for large language models to draw insights that further the understanding of FFNs. Code is available in this repository: https://github.com/qiuzh20/Tuning-keys-v.s.-values
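The key-value view above, $\mathrm{FFN}(h) = f(h K^\top) V$, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a ReLU activation for $f$, a squared-error loss, and plain gradient descent, and the names `ffn` and `grad_step` are hypothetical helpers introduced only for illustration. It shows what "updating keys" versus "updating values" means mechanically: the same loss is minimized, but only one of the two weight matrices receives the gradient.

```python
import numpy as np

def ffn(h, K, V):
    """Key-value memory view of an FFN: f(h K^T) V, with f = ReLU (assumed)."""
    return np.maximum(h @ K.T, 0.0) @ V

def loss(h, target, K, V):
    """Squared-error loss (an assumption; the paper's tasks use their own losses)."""
    return 0.5 * np.sum((ffn(h, K, V) - target) ** 2)

def grad_step(h, target, K, V, lr=1e-4, update="keys"):
    """One gradient-descent step that touches only the keys or only the values.

    Shapes: h (n, d), K (m, d), V (m, d_out), target (n, d_out).
    Returns the (possibly updated) pair (K, V).
    """
    pre = h @ K.T                  # (n, m) key-matching scores
    act = np.maximum(pre, 0.0)     # (n, m) memory coefficients
    err = act @ V - target         # (n, d_out) output error
    if update == "keys":
        # Back-propagate through V and the ReLU to reach K; V is left frozen.
        d_pre = (err @ V.T) * (pre > 0)
        return K - lr * (d_pre.T @ h), V
    if update == "values":
        # The gradient w.r.t. V needs only the activations; K is left frozen.
        return K, V - lr * (act.T @ err)
    raise ValueError("update must be 'keys' or 'values'")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.normal(size=(4, 8))        # 4 hidden states of width 8
    K = rng.normal(size=(16, 8))       # 16 keys
    V = rng.normal(size=(16, 8))       # 16 values
    target = rng.normal(size=(4, 8))   # toy regression target
    for mode in ("keys", "values"):
        K_new, V_new = grad_step(h, target, K, V, update=mode)
        print(mode, loss(h, target, K, V), loss(h, target, K_new, V_new))
```

In this toy setting, a key update changes which memories fire for a given input, while a value update changes what the already-firing memories write to the output. It says nothing about the generalization or locality behaviour the study measures on real models.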
