Time Sensitive Knowledge Editing through Efficient Finetuning

Xiou Ge, Ali Mousavi, Edouard Grave, Armand Joulin, Kun Qian, Benjamin Han, Mostafa Arefiyan, Yunyao Li

TL;DR

The paper addresses how to keep time-sensitive factual knowledge in LLMs up to date. It models an edit as $(s, r, o) \to (s, r, o')$ for modification and $(s, r, \emptyset) \to (s, r, o')$ for injection, where $s$ is the subject, $r$ the relation, $o$ the old object, and $o'$ the new object. It argues that traditional locate-and-edit approaches face stability and scalability issues and introduces a fine-tuning objective $L_{FT} = \frac{1}{|D_M|} \sum_{d \in D_M} L(d; \Phi_0, \Delta \Phi)$ with frozen base weights $\Phi_0$ and learnable adapters $\Delta \Phi$. ChronoEdit, a dataset of roughly 15k time-sensitive edits, is used to benchmark performance, and a layer sweep shows that the middle transformer layers are particularly influential for multi-hop QA. Empirically, LoRA-based PEFT applied to MLP (and optionally attention) layers can match or exceed full fine-tuning with far fewer trainable parameters and better locality, which makes PEFT a practical path for time-sensitive KE.
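
The edit notation and the objective above can be made concrete with a small sketch. It is illustrative only: the `Edit` container, the helper names, and the toy matrices are assumptions for exposition, not the paper's implementation. It shows how the two edit types differ and how $L_{FT}$ averages a per-example loss, with $\Delta \Phi$ written as a LoRA-style low-rank product $BA$ added to the frozen $\Phi_0$.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Matrix = List[List[float]]


@dataclass(frozen=True)
class Edit:
    """One knowledge edit (s, r, o) -> (s, r, o'). Hypothetical container."""
    subject: str
    relation: str
    old_object: Optional[str]  # None stands in for the empty object
    new_object: str

    @property
    def kind(self) -> str:
        # Injection has no prior object; modification replaces an existing one.
        return "injection" if self.old_object is None else "modification"


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def effective_weight(w0: Matrix, lora_b: Matrix, lora_a: Matrix) -> Matrix:
    """Phi_0 + Delta Phi, assuming Delta Phi = B @ A (a low-rank update).

    w0 is never modified; in training only lora_b and lora_a would change.
    """
    delta = matmul(lora_b, lora_a)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(w0, delta)]


def finetune_objective(
    dataset: Sequence[Edit], per_example_loss: Callable[[Edit], float]
) -> float:
    """L_FT: the mean per-example loss over the edit set D_M."""
    return sum(per_example_loss(d) for d in dataset) / len(dataset)


# Usage with placeholder objects (not real data from the paper).
edits = [
    Edit("United Kingdom", "head of government", "<previous holder>", "<new holder>"),
    Edit("United Kingdom", "head of government", None, "<new holder>"),
]
kinds = [e.kind for e in edits]
```

In a real setup the per-example loss would be the language-modeling loss of the adapted model on the edited fact; here it is left as a callable so the averaging step stays visible.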

Abstract

Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up to date remains a challenge once pretraining is complete. It is thus essential to design effective methods to both update obsolete knowledge and inject new knowledge into LLMs. Existing locate-and-edit knowledge editing (KE) methods suffer from two limitations. First, LLMs edited by such methods generally have poor capability in answering complex queries that require multi-hop reasoning. Second, the long run-time of such locate-and-edit methods makes them infeasible for large-scale KE in practice. In this paper, we explore Parameter-Efficient Fine-Tuning (PEFT) techniques as an alternative for KE. We curate a more comprehensive temporal KE dataset with both knowledge update and knowledge injection examples for KE performance benchmarking. We further probe the effect of fine-tuning on a range of layers in an LLM for the multi-hop QA task. We find that PEFT performs better than locate-and-edit techniques for time-sensitive knowledge edits.
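
One way to organize a probe over a range of layers is to enumerate contiguous blocks of layers and fine-tune adapters on one block at a time. The helper below is a hypothetical sketch of that bookkeeping; the function name, the window size, the stride, and the 32-layer depth in the usage line are assumptions for illustration, not settings taken from the paper.

```python
from typing import List, Tuple


def layer_windows(num_layers: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of windows that fit in the model."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > num_layers:
        raise ValueError("window cannot exceed the number of layers")
    return [
        (start, start + window - 1)
        for start in range(0, num_layers - window + 1, stride)
    ]


# Assumed example: a 32-layer model probed in non-overlapping blocks of 8 layers.
blocks = layer_windows(num_layers=32, window=8, stride=8)
```

Each returned range would select the layers whose adapters are trainable in one run, so that multi-hop QA accuracy can be compared across blocks.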

Paper Structure

This paper contains 8 sections, 2 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: Who's the "current" head of the United Kingdom government?
  • Figure 2: Dataset statistics of ChronoEdit.
  • Figure 3: Reliability, Generalization, and Locality performance versus the number of edits on LLaMA-7B.
  • Figure 4: Performance of fine-tuning methods on the MQuAKE-T multi-hop dataset for LLaMA-7B.
  • Figure 5: Performance of LoRA at different ranks for the MQuAKE-T multi-hop dataset with LLaMA-7B.
  • ...and 3 more figures