Reforming the Mechanism: Editing Reasoning Patterns in LLMs with Circuit Reshaping

Zhenyu Lei, Qiong Wu, Jianxiong Dong, Yinhan He, Emily Dodwell, Yushun Dong, Jundong Li

TL;DR

This work uncovers the Circuit-Interference Law and proposes REdit, the first framework to actively reshape neural circuits before editing, which modulates interference between reasoning patterns and mitigates the generality-locality trade-off.

Abstract

Large language models (LLMs) often exhibit flawed reasoning that undermines their reliability. Existing approaches to improving reasoning typically treat it as a general, monolithic skill and apply broad training that is inefficient and cannot target specific reasoning errors. We introduce Reasoning Editing, a paradigm for selectively modifying specific reasoning patterns in LLMs while preserving other reasoning pathways. This task presents a fundamental trade-off between Generality, the ability of an edit to generalize across different tasks sharing the same reasoning pattern, and Locality, the ability to preserve other reasoning capabilities. Through systematic investigation, we uncover the Circuit-Interference Law: edit interference between reasoning patterns is proportional to the overlap of their neural circuits. Guided by this principle, we propose REdit, the first framework to actively reshape neural circuits before editing, thereby modulating interference between reasoning patterns and mitigating the trade-off. REdit integrates three components: (i) Contrastive Circuit Reshaping, which directly addresses the generality-locality trade-off by disentangling overlapping circuits; (ii) Meta-Contrastive Learning, which extends transferability to novel reasoning patterns; and (iii) Dual-Level Protection, which preserves preexisting abilities by constraining reshaping update directions and regularizing task-level predictions. Extensive experiments with Qwen-2.5-3B on propositional logic reasoning tasks across three difficulty levels demonstrate that REdit consistently achieves superior generality and locality compared to baselines, with additional validation in mathematics showing broader potential. Our code is available at https://github.com/LzyFischer/REdit.
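
To make the notion of "circuit overlap" in the Circuit-Interference Law concrete, here is a minimal sketch. It assumes a circuit can be represented as a set of edge labels and uses Jaccard similarity as the overlap measure. The abstract does not specify the paper's actual circuit representation or metric, so the function name, the edge labels, and the three pattern names below are all hypothetical and serve only as illustration.

```python
def jaccard_overlap(circuit_a, circuit_b):
    """Jaccard similarity between two circuits given as sets of edge labels.

    Assumption: a circuit is a set of hashable edge identifiers. This is an
    illustrative stand-in, not the overlap measure used in the paper.
    """
    a, b = set(circuit_a), set(circuit_b)
    if not a and not b:
        return 0.0  # two empty circuits share nothing
    return len(a & b) / len(a | b)


# Hypothetical circuits for three reasoning patterns (made-up edge names).
modus_ponens = {"L3.H1->L5.MLP", "L5.MLP->L7.H2", "L7.H2->out"}
modus_tollens = {"L3.H1->L5.MLP", "L5.MLP->L7.H4", "L7.H4->out"}
disjunctive_syllogism = {"L2.H0->L4.MLP", "L4.MLP->L6.H3", "L6.H3->out"}

overlap_close = jaccard_overlap(modus_ponens, modus_tollens)
overlap_far = jaccard_overlap(modus_ponens, disjunctive_syllogism)

print(f"overlap(modus_ponens, modus_tollens) = {overlap_close:.2f}")
print(f"overlap(modus_ponens, disjunctive_syllogism) = {overlap_far:.2f}")
```

Under the law as stated, an edit to the first pattern would be expected to interfere more with the second pattern, which shares an edge with it, than with the third, which shares none. Reshaping circuits before editing aims to reduce such overlap between different patterns.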

Paper Structure

This paper contains 33 sections, 13 equations, 9 figures, 7 tables, 1 algorithm.

Figures (9)

  • Figure 1: LLM reasoning deficiencies and editing trade-off.
  • Figure 2: Correlation between circuit distance and interference. (a–c) Scatter plots with regression lines show that larger distances consistently correspond to reduced interference across different distance metrics. (d) Density plots of Pearson correlations confirm consistent negative associations.
  • Figure 3: Editing success rates across methods on ContextHub. REdit achieves success rates comparable to other approaches, confirming that it does not compromise the model’s fundamental editing capabilities.
  • Figure 4: Performance on unseen reasoning patterns after circuit reshaping with different ratios for training. REdit consistently outperforms baselines without reshaping.
  • Figure 5: Circuit–interference relationship before and after circuit reshaping. (a,b) Scatter plots of intra- and inter-pattern measurements show improved separability in interference and circuit distance. (c) Silhouette scores across reasoning patterns indicate consistent gains in cluster separation.
  • ...and 4 more figures
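
The Figure 2 caption describes a negative Pearson correlation between circuit distance and interference. The sketch below shows how such a correlation could be computed for one set of pattern pairs. The numbers are synthetic values invented for illustration and are not results from the paper; the helper name `pearson` and both lists are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Synthetic, made-up values: one entry per pair of reasoning patterns.
circuit_distance = [0.1, 0.3, 0.5, 0.7, 0.9]
interference = [0.80, 0.65, 0.50, 0.20, 0.15]

r = pearson(circuit_distance, interference)
print(f"Pearson r = {r:.3f}")  # negative: larger distance, less interference
```

A strongly negative coefficient on real measurements would correspond to the trend the caption reports, where pairs of patterns with more distant circuits interfere less when one of them is edited.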

Theorems & Definitions (3)

  • Definition 1: Propositional-Logic (PL) Reasoning
  • Definition 2: Reasoning Pattern
  • Definition 3: Neural Approximation of PL