Robust and Scalable Model Editing for Large Language Models

Yingfa Chen, Zhengyan Zhang, Xu Han, Chaojun Xiao, Zhiyuan Liu, Chen Chen, Kuai Li, Tao Yang, Maosong Sun

TL;DR

This work discovers that, with proper prompting methods, instruction-finetuned LLMs can be highly controllable by contextual knowledge and robust to irrelevant context, and proposes EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.

Abstract

Large language models (LLMs) can make predictions using parametric knowledge (knowledge encoded in the model weights) or contextual knowledge (knowledge presented in the context). In many scenarios, a desirable behavior is that LLMs give precedence to contextual knowledge when it conflicts with the parametric knowledge, and fall back to their parametric knowledge when the context is irrelevant. This enables updating and correcting the model's knowledge by in-context editing instead of retraining. Previous work has shown that LLMs are inclined to ignore contextual knowledge and fail to reliably fall back to parametric knowledge when presented with irrelevant context. In this work, we discover that, with proper prompting methods, instruction-finetuned LLMs can be highly controllable by contextual knowledge and robust to irrelevant context. Utilizing this feature, we propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing. To better evaluate the robustness of model editors, we collect a new dataset that contains irrelevant questions that are more challenging than the ones in existing datasets. Empirical results show that our method outperforms current state-of-the-art methods by a large margin. Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs (and vice versa). The source code can be found at https://github.com/thunlp/EREN.

Paper Structure (53 sections, 4 equations, 4 figures, 8 tables)

This paper contains 53 sections, 4 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Illustration of the framework of EREN. Two edits have been injected, and the colored part shows inference on two inputs. Green: Both edits are relevant, and the final output depends on both. Yellow: The LLM determines that no edit is relevant, and the output of the base model is used.
  • Figure 2: The performance of EREN on different base models. The red dotted line represents the EQ of SERAC + DA. T5 and T5 (IT) are the non-instruction-tuned and instruction-tuned versions of T5-XL, and GPT3.5 is the gpt-3.5-turbo API. See the appendix on different base models for more details.
  • Figure 3: Edit quality of EREN, ROME, and Full FT for different numbers of edits on CounterFact. The colored area shows the standard deviation over 5 runs.
  • Figure 4: The performance of EREN and the recall rate of the note retriever when retrieving different numbers of notes on CounterFact.
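
The flow that Figure 1 illustrates (retrieve injected notes, answer from them when they are relevant, otherwise use the base model's output) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the word-overlap retriever, the `toy_reader` relevance rule, the stopword list, and every function name are hypothetical stand-ins. In EREN the relevance decision is made by the prompted LLM itself, not by a word-overlap threshold.

```python
from typing import Callable, List, Optional

# Hypothetical stopword list, used only to make the toy overlap score meaningful.
STOPWORDS = {"what", "who", "is", "the", "of", "a", "an"}


def content_words(text: str) -> set:
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    words = {w.strip(".,?!").lower() for w in text.split()}
    return words - STOPWORDS


def retrieve_notes(question: str, notes: List[str], k: int = 2) -> List[str]:
    """Toy note retriever (assumption): rank notes by word overlap, keep the top k."""
    q = content_words(question)
    ranked = sorted(notes, key=lambda n: len(q & content_words(n)), reverse=True)
    return ranked[:k]


def toy_reader(question: str, candidates: List[str], min_overlap: int = 2) -> Optional[str]:
    """Stand-in for the LLM reading step (assumption).

    Returns the relevant notes joined together, or None when no note is
    judged relevant to the question.
    """
    q = content_words(question)
    relevant = [n for n in candidates if len(q & content_words(n)) >= min_overlap]
    return " ".join(relevant) if relevant else None


def answer_with_notes(
    question: str,
    notes: List[str],
    reader: Callable[[str, List[str]], Optional[str]],
    base_model: Callable[[str], str],
    k: int = 2,
) -> str:
    """Answer from retrieved notes when relevant; otherwise fall back to the base model."""
    candidates = retrieve_notes(question, notes, k)
    answer = reader(question, candidates)
    if answer is None:
        # No edit applies, so the unedited base model's output is used.
        return base_model(question)
    return answer
```

Because an edit is just a note appended to the list, adding or removing edits requires no change to model weights, and several relevant notes can feed into one answer.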