Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

Zhuoran Zhang, Yongxiang Li, Zijian Kan, Keyuan Cheng, Lijie Hu, Di Wang

TL;DR

This work identifies a mechanistic gap in existing knowledge editing methods for multi-hop factual recall, showing that such tasks rely on deeper MLP layers and implicit subject information. It introduces IFMET, a two-stage, interpretability-guided editing framework that uses multi-hop prompts to update both shallow and deep layers, addressing the limitations of prior approaches. Through experiments on MQuAKE-3K (and generalization to LLaMA-2-7B), IFMET delivers substantial improvements in multi-hop recall accuracy and demonstrates the essential role of deeper-layer edits, guided by mechanistic interpretability analyses. The approach offers a principled path toward reliable, scalable knowledge edits in LLMs, with broad implications for maintaining factual correctness in complex reasoning tasks.

Abstract

The locate-then-edit paradigm has shown significant promise for knowledge editing (KE) in Large Language Models (LLMs). While previous methods perform well on single-hop fact recall tasks, they consistently struggle with multi-hop factual recall tasks involving newly edited knowledge. In this paper, leveraging tools in mechanistic interpretability, we first identify that in multi-hop tasks, LLMs tend to retrieve knowledge with implicit subject information from deeper MLP layers, unlike single-hop tasks, which rely on shallow layers. This distinction explains the poor performance of current methods in multi-hop queries, as they primarily focus on editing shallow layers with single-hop edit prompts, leaving deeper layers unchanged. To address this, we propose IFMET, a novel locate-then-edit KE approach designed to edit both shallow and deep MLP layers. Beyond single-hop editing prompts, IFMET further incorporates multi-hop editing prompts to locate and modify knowledge across different stages of reasoning. Experimental results demonstrate that IFMET significantly improves performance on multi-hop factual recall tasks, overcoming the limitations of previous locate-then-edit methods.

Paper Structure

This paper contains 33 sections, 11 equations, 8 figures, 14 tables, and 1 algorithm.

Figures (8)

  • Figure 1: (a) Existing locate-then-edit KE methods write a new fact into the shallow layers of the model using a single-hop edit prompt. (b) In multi-hop fact recall tasks, especially when the edited fact is in the second or a subsequent hop, the later hops typically access the deeper layers, which still output the unmodified knowledge. (c) Our method introduces a prefix hop for each single-hop edit, creating a two-hop edit prompt. We use this new prompt to perform a furtherance edit that targets the deeper layers for more effective knowledge updating.
  • Figure 2: LogitLens results at the last token position across layers. The yellow line represents the information about the implicit subject $s_2$, i.e., $\text{Info}(h_l,s_2)$. The blue line represents the information about the final answer, i.e., $\text{Info}(h_l,o_2)$. Larger versions of the sub-figures are available in Appendix Figure \ref{fig:larger-logitlens-results}.
  • Figure 3: Causal intervention results. A brighter color signifies a stronger intervention effect. Negative effect values ($\leq 0$) are clipped to 0 in both groups for better visualization. (a) shows the probability change $IE_h$ under intervention $\mathcal{I}_h$; (b) shows the probability change $IE_m$ under intervention $\mathcal{I}_m$.
  • Figure 4: Relation number for 1-hop and 2-hop
  • Figure 5: Causal intervention results for the MLP input at the last token position in the single-hop case
  • ...and 3 more figures
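
The layer-wise curves described for Figure 2 can be illustrated with a minimal logit-lens sketch. This section does not define $\text{Info}(h_l,\cdot)$, so the code below uses the softmax probability of a target token as a stand-in; that choice is an assumption and may differ from the paper's measure. The function name `logit_lens_curve`, the toy matrices, and the omission of the final layer norm (which real logit-lens implementations usually apply before unembedding) are illustrative simplifications, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logit_lens_curve(hidden_states, unembed, token_id):
    """Project each layer's last-token hidden state onto the vocabulary.

    hidden_states: (num_layers, d_model), one last-token state per layer
    unembed:       (d_model, vocab_size) unembedding matrix
    token_id:      index of the token to track (e.g. s_2 or o_2)
    Returns an array of shape (num_layers,) with the probability
    assigned to token_id when decoding directly from each layer.
    """
    logits = hidden_states @ unembed   # (num_layers, vocab_size)
    probs = softmax(logits)
    return probs[:, token_id]

# Toy example (hypothetical data): 4-token vocabulary, identity unembedding.
unembed = np.eye(4)
hidden_states = np.array([
    [0.0, 0.0, 0.0, 0.0],   # early layer: no information, uniform output
    [0.0, 0.0, 5.0, 0.0],   # later layer: strongly points at token 2
])
curve = logit_lens_curve(hidden_states, unembed, token_id=2)
print(curve)
```

Tracking one such curve for the implicit subject token and another for the final answer token, layer by layer, gives the kind of two-line plot the Figure 2 caption describes.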