Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing
Zhuoran Zhang, Yongxiang Li, Zijian Kan, Keyuan Cheng, Lijie Hu, Di Wang
TL;DR
This work identifies a mechanistic gap in existing knowledge editing methods for multi-hop factual recall, showing that such tasks rely on deeper MLP layers and implicit subject information. It introduces IFMET, a two-stage, interpretability-guided editing framework that uses multi-hop prompts to update both shallow and deep layers, addressing the limitations of prior approaches. In experiments on MQuAKE-3K, including a generalization study on LLaMA-2-7B, IFMET substantially improves multi-hop recall accuracy, and the accompanying mechanistic interpretability analyses indicate that deeper-layer edits are essential. The approach offers a principled path toward reliable, scalable knowledge edits in LLMs, with implications for maintaining factual correctness in complex reasoning tasks.
Abstract
The locate-then-edit paradigm has shown significant promise for knowledge editing (KE) in Large Language Models (LLMs). While previous methods perform well on single-hop factual recall tasks, they consistently struggle with multi-hop factual recall tasks involving newly edited knowledge. In this paper, leveraging tools from mechanistic interpretability, we first identify that in multi-hop tasks, LLMs tend to retrieve knowledge with implicit subject information from deeper MLP layers, unlike single-hop tasks, which rely on shallow layers. This distinction explains the poor performance of current methods on multi-hop queries: they primarily edit shallow layers with single-hop edit prompts, leaving deeper layers unchanged. To address this, we propose IFMET, a novel locate-then-edit KE approach designed to edit both shallow and deep MLP layers. Beyond single-hop editing prompts, IFMET further incorporates multi-hop editing prompts to locate and modify knowledge across different stages of reasoning. Experimental results demonstrate that IFMET significantly improves performance on multi-hop factual recall tasks, overcoming the limitations of previous locate-then-edit methods.
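To make the idea of editing an MLP layer concrete, the sketch below shows a generic rank-one weight update of the kind used in locate-then-edit methods, applied once to a toy "shallow" matrix and once to a toy "deep" matrix. It is an illustration under stated assumptions, not IFMET's actual procedure: the matrices, the key vectors `k_explicit` and `k_implicit`, the target `v_new`, and the helper `rank_one_edit` are all hypothetical, and the update is the plain minimum-norm form without the key-covariance weighting that published methods typically add.

```python
import numpy as np


def rank_one_edit(W, k, v_new):
    """Return W' with W' @ k == v_new, differing from W by a rank-one term.

    Hypothetical helper: treats the MLP projection W as a linear
    key-value memory and rewrites the value stored under key k.
    """
    residual = v_new - W @ k                      # what the layer currently gets wrong
    return W + np.outer(residual, k) / (k @ k)    # minimum-norm correction


rng = np.random.default_rng(0)

# Toy stand-ins for one shallow and one deep MLP projection (assumed shapes).
W_shallow = rng.normal(size=(4, 6))
W_deep = rng.normal(size=(4, 6))

# Assumed keys: one derived from a single-hop prompt (explicit subject),
# one derived from a multi-hop prompt (implicit subject).
k_explicit = rng.normal(size=6)
k_implicit = rng.normal(size=6)

# Assumed representation of the new fact to be stored.
v_new = rng.normal(size=4)

# Edit both layers so each retrieves the new value for its own key.
W_shallow_edited = rank_one_edit(W_shallow, k_explicit, v_new)
W_deep_edited = rank_one_edit(W_deep, k_implicit, v_new)
```

Editing only `W_shallow` would leave `W_deep @ k_implicit` unchanged, which mirrors, in toy form, the failure mode the abstract attributes to shallow-only editing.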
