Reason-KE++: Aligning the Process, Not Just the Outcome, for Faithful LLM Knowledge Editing

Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao

TL;DR

The paper tackles the faithfulness gap in knowledge editing for LLMs performing multi-hop reasoning. It introduces Reason-KE++, a two-stage SFT+RL framework with a Stage-aware Reward that supervises intermediate reasoning steps, not just final answers. By structuring each problem as acknowledged updates, decomposed sub-questions, and sequential actions, and by enforcing formatted, stage-wise feedback, the approach reaches a state-of-the-art 95.48% on MQUAKE-CF-3k and demonstrates robustness to distractors and leakage. The work offers a practical pathway toward trustworthy LLMs in dynamic knowledge settings and generalizes broadly across models and domains.

Abstract

Aligning Large Language Models (LLMs) to be faithful to new knowledge in complex, multi-hop reasoning tasks is a critical, yet unsolved, challenge. We find that SFT-based methods, e.g., Reason-KE, while state-of-the-art, suffer from a "faithfulness gap": they optimize for format mimicry rather than sound reasoning. This gap enables the LLM's powerful parametric priors to override new contextual facts, resulting in critical factual hallucinations (e.g., incorrectly reasoning "Houston" from "NASA" despite an explicit edit). To solve this core LLM alignment problem, we propose Reason-KE++, an SFT+RL framework that instills process-level faithfulness. Its core is a Stage-aware Reward mechanism that provides dense supervision for intermediate reasoning steps (e.g., Decomposition, Sub-answer Correctness). Crucially, we identify that naive outcome-only RL is a deceptive trap for LLM alignment: it collapses reasoning integrity (e.g., 19.00% Hop acc) while superficially boosting final accuracy. Our process-aware framework sets a new SOTA of 95.48% on MQUAKE-CF-3k (+5.28%), demonstrating that for complex tasks, aligning the reasoning process is essential for building trustworthy LLMs.

Paper Structure

This paper contains 44 sections, 3 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: An illustration of our core motivation. (a) Existing methods often take an unfaithful shortcut based on strong priors, ignoring updated information and leading to incorrect answers. (b) Our Reason-KE++ decomposes the multi-hop query, ensuring a faithful reasoning process that correctly utilizes the new knowledge.
  • Figure 2: The two-stage pipeline of Reason-KE++. It starts with a cold-start SFT phase for foundational learning, followed by a Stage-aware Reinforcement Learning phase. In the RL stage, a dense reward signal, composed of a detailed Process Score and an Outcome Score, is used to optimize the model's ability to generate faithful reasoning.
  • Figure 3: Case study comparing Reason-KE and Reason-KE++. 'Reason-KE' defaults to its parametric prior (NASA $\rightarrow$ Houston), exhibiting factual hallucination. 'Reason-KE++' uses structured decomposition to faithfully arrive at the correct answer.
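
The dense reward described for Figure 2 combines a Process Score with an Outcome Score. The exact formula is not given in this summary, so the following is only a minimal sketch of how such a stage-aware reward could be assembled. The class names, field names, exact-match scoring, and the 0.5 weights are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReasoningTrace:
    """Hypothetical parsed model output; field names are illustrative."""
    well_formatted: bool       # did the output follow the expected structure?
    sub_questions: List[str]   # decomposed sub-questions
    sub_answers: List[str]     # answer produced for each hop
    final_answer: str


@dataclass
class Reference:
    """Hypothetical gold annotation for one multi-hop query."""
    sub_answers: List[str]
    final_answer: str


def _norm(text: str) -> str:
    return text.strip().lower()


def process_score(trace: ReasoningTrace, ref: Reference) -> float:
    """Score the intermediate steps (assumed: decomposition + per-hop accuracy)."""
    n_hops = len(ref.sub_answers)
    if not trace.well_formatted or n_hops == 0:
        return 0.0
    # Assumed decomposition check: one sub-question per gold hop.
    decomposition = 1.0 if len(trace.sub_questions) == n_hops else 0.0
    correct = sum(
        _norm(pred) == _norm(gold)
        for pred, gold in zip(trace.sub_answers, ref.sub_answers)
    )
    hop_acc = correct / n_hops
    return 0.5 * decomposition + 0.5 * hop_acc  # assumed equal weighting


def outcome_score(trace: ReasoningTrace, ref: Reference) -> float:
    """Outcome-only signal: exact match on the final answer."""
    return 1.0 if _norm(trace.final_answer) == _norm(ref.final_answer) else 0.0


def stage_aware_reward(trace: ReasoningTrace, ref: Reference,
                       w_process: float = 0.5, w_outcome: float = 0.5) -> float:
    """Weighted sum of process and outcome scores (weights are assumptions)."""
    return w_process * process_score(trace, ref) + w_outcome * outcome_score(trace, ref)


# Toy values, invented purely for illustration.
ref = Reference(sub_answers=["entity_a", "entity_b"], final_answer="entity_b")
faithful = ReasoningTrace(True, ["q1", "q2"], ["entity_a", "entity_b"], "entity_b")
shortcut = ReasoningTrace(True, ["q1"], ["wrong"], "entity_b")

print(outcome_score(shortcut, ref))       # outcome-only reward cannot see the bad hop
print(stage_aware_reward(shortcut, ref))  # process term penalizes the shortcut
print(stage_aware_reward(faithful, ref))
```

In this toy setup the shortcut trace earns full credit under the outcome-only score but only half under the combined reward, which mirrors the abstract's point that outcome-only RL can raise final accuracy while reasoning integrity degrades.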