Joint Knowledge Editing for Information Enrichment and Probability Promotion

Wenhang Shi, Yiren Chen, Shuqing Bian, Xinyi Zhang, Zhe Zhao, Pengfei Hu, Wei Lu, Xiaoyong Du

TL;DR

The work tackles the problem of updating knowledge in large language models without catastrophic forgetting by showing that editing should target distinct recall stages. A contrast-based probe reveals Information Enrichment in low layers and Probability Promotion in high layers, motivating a joint editing approach called JEEP that modifies both regions with adaptive, synergistic optimization. Across GPT-J (6B) and LLaMA (7B), on both factual and counterfactual tasks, JEEP consistently outperforms baselines in efficacy, generalization, and locality, demonstrating robust, scalable updates. The findings offer a practical pathway for efficient and reliable knowledge editing in real-world, rapidly changing information landscapes, with code and data released for reproducibility.

Abstract

Knowledge stored in large language models requires timely updates to reflect the dynamic nature of real-world information. To update this knowledge, most knowledge editing methods focus on the low layers, since recent probes into the knowledge recall process reveal that the answer information is enriched in low layers. However, these probes reveal, and can only reveal, the critical recall stages for the original answers, while the goal of editing is to rectify the model's prediction for the target answers. This inconsistency indicates that both the probe approaches and the associated editing methods are deficient. To mitigate the inconsistency and identify critical editing regions, we propose a contrast-based probe approach and locate two crucial stages where the model behavior diverges between the original and target answers: Information Enrichment in low layers and Probability Promotion in high layers. Building on these insights, we develop the Joint knowledge Editing for information Enrichment and probability Promotion (JEEP) method, which jointly edits both the low and high layers to modify the two critical recall stages. Because dual modifications cause mutual interference and increased forgetting, JEEP is designed to ensure that updates to distinct regions share the same objectives and are complementary. We rigorously evaluate JEEP by editing up to thousands of facts on various models, i.e., GPT-J (6B) and LLaMA (7B), and addressing diverse editing objectives, i.e., adding factual and counterfactual knowledge. In all tested scenarios, JEEP achieves the best performance, validating both the findings of our probe approach and the design of our editing method. Our code and data are available at https://github.com/Eric8932/JEEP.

Paper Structure

This paper contains 39 sections, 18 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Using probability as an information indicator, we directly observe the original answer's information flow within the model. By further contrasting the information flow of the original and target answers, we identify two critical recall stages for knowledge editing: Information Enrichment in low layers and Probability Promotion in high layers.
  • Figure 2: Information change of the original and target answers in low, middle, and high layers, indicated by rank and probability (Prob). The top and bottom graphs show the Subject Last and Prediction positions, respectively.
  • Figure 3: Procedure of the JEEP method. First, it computes $\delta'$ and $\delta^{*}$ for the low and high layers simultaneously, optimizing for injecting new knowledge. Second, it uses the residual errors of $v_{i'}^{L'}$ to update the low MLP layers. Finally, it uses the residual errors of $v_{i^{*}}^{L^{*}}$ to update the high MLP layers.
  • Figure 4: Scaling curves showing how performance changes with the number of edits (log scale) on LLaMA (7B). Shaded areas show 95% confidence intervals.
  • Figure 5: Information change of different sets at the last prediction position, indicated by their average probability in the representations.
  • ...and 3 more figures
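
The idea behind Figures 1 and 2 (tracking an answer's probability and rank layer by layer, then contrasting the original and target answers) can be illustrated with a minimal sketch. It assumes a logit-lens-style readout, in which each layer's hidden state is projected through the output embedding matrix to obtain a token distribution; the text above does not confirm that the paper's probe is computed this way. The function names `layerwise_prob_and_rank` and `contrast_probe`, the toy inputs, and the returned keys are all hypothetical and are not taken from the JEEP repository.

```python
import math

def layerwise_prob_and_rank(hidden, unembed, token_id):
    """Per-layer probability and rank of one token.

    hidden:  list of per-layer hidden vectors at one position (n_layers x d)
    unembed: output embedding matrix as rows per vocabulary item (vocab x d)
    Assumption: a plain softmax over hidden @ unembed^T stands in for the
    model's real output head (no final layer norm is applied here).
    """
    probs_out, ranks_out = [], []
    for h in hidden:
        logits = [sum(w * x for w, x in zip(row, h)) for row in unembed]
        top = max(logits)                      # subtract max for a stable softmax
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        p = probs[token_id]
        # Rank 1 means no other token is strictly more probable.
        rank = 1 + sum(1 for q in probs if q > p)
        probs_out.append(p)
        ranks_out.append(rank)
    return probs_out, ranks_out

def contrast_probe(hidden, unembed, original_id, target_id):
    """Contrast the original and target answers layer by layer."""
    p_o, r_o = layerwise_prob_and_rank(hidden, unembed, original_id)
    p_t, r_t = layerwise_prob_and_rank(hidden, unembed, target_id)
    return {
        "orig_prob": p_o,
        "target_prob": p_t,
        "orig_rank": r_o,
        "target_rank": r_t,
        # Positive gap: the layer favours the original answer over the target.
        "prob_gap": [a - b for a, b in zip(p_o, p_t)],
    }

if __name__ == "__main__":
    # Toy 2-layer, 4-token example with an identity unembedding (illustrative only).
    toy_hidden = [[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
    toy_unembed = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    result = contrast_probe(toy_hidden, toy_unembed, original_id=0, target_id=1)
    for layer, gap in enumerate(result["prob_gap"]):
        print(layer, result["orig_rank"][layer], result["target_rank"][layer], gap)
```

In a sketch like this, layers where the gap between the two answers starts to open up would be the candidates for editing; with a real model, the hidden states would come from the Subject Last or Prediction position named in Figure 2.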