Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning

Maxime Méloux; Christophe Cerisara

TL;DR

This work addresses how to update pre-trained LLMs with novel world knowledge by evaluating prefix-tuning as a compact knowledge store. It introduces Novel-WD, a Wikidata-derived dataset, and a dynamic benchmark to measure perplexity, knowledge acquisition, generalization, and MCQ performance for recently added facts. The study finds that a single, small prefix can encode individual facts and that both prefix length and, more strongly, prefix depth significantly influence learning and generalization, with larger base models offering additional accuracy gains. These results support using full-depth prefixes of moderate length as a practical approach for integrating up-to-date factual knowledge, while highlighting limitations and avenues for scaling via retrieval-augmented or routing-based methods. The released dataset and methodology provide a concrete foundation for systematic evaluation of fact learning in LLMs.
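As a rough illustration of two of the metrics named above, the sketch below computes perplexity from per-token log-probabilities and MCQ accuracy by picking the highest-scoring option. The function names and the argmax scoring rule are assumptions made for illustration; the benchmark's actual scoring procedure may differ.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one sequence from per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Perplexity is the exponential of the mean negative log-likelihood.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def mcq_accuracy(questions):
    """Fraction of questions whose highest-scoring option is the correct one.

    `questions` is a list of (option_scores, correct_index) pairs, where
    option_scores holds one model score (e.g. a log-likelihood) per option.
    This argmax rule is an assumed scoring scheme, not the paper's.
    """
    if not questions:
        raise ValueError("need at least one question")
    correct = 0
    for scores, answer in questions:
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == answer)
    return correct / len(questions)
```

A lower perplexity on a sentence stating a new fact suggests the model finds that sentence more likely, which is one way to read "knowledge acquisition" in the causal language modeling setting.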

Abstract

Teaching new information to pre-trained large language models (PLMs) is a crucial but challenging task. Model adaptation techniques, such as fine-tuning and parameter-efficient training, have been shown to store new facts at a slow rate; continual learning is an option but is costly and prone to catastrophic forgetting. This work studies and quantifies how PLMs may learn and remember new world knowledge facts that do not occur in their pre-training corpus, which only contains world knowledge up to a certain date. To that end, we first propose Novel-WD, a new dataset consisting of sentences containing novel facts extracted from recent Wikidata updates, along with two evaluation tasks in the form of causal language modeling and multiple choice questions (MCQ). We make this dataset freely available to the community, and release a procedure to later build new versions of similar datasets with up-to-date information. We also explore the use of prefix-tuning for novel information learning, and analyze how much information can be stored within a given prefix. We show that a single fact can reliably be encoded within a single prefix, and that the prefix capacity increases with its length and with the base model size.
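To give a sense of why prefix length and base model size both affect capacity, the sketch below counts the trainable values in a prefix that supplies one key vector and one value vector per prefix position in each covered layer. It assumes standard multi-head attention where the key/value width equals the hidden size and no reparameterization network; the function name and the example sizes are hypothetical and are not taken from the paper.

```python
def prefix_param_count(prefix_length, num_layers, hidden_size, depth=None):
    """Number of trainable values in a key/value prefix.

    prefix_length: number of virtual prefix positions.
    num_layers:    number of transformer layers in the base model.
    hidden_size:   width of the key/value vectors (assumed equal to the
                   model's hidden size; grouped-query attention would differ).
    depth:         how many layers receive a prefix; defaults to all layers.
    """
    layers = num_layers if depth is None else depth
    if prefix_length <= 0 or hidden_size <= 0:
        raise ValueError("prefix_length and hidden_size must be positive")
    if not 0 < layers <= num_layers:
        raise ValueError("depth must be between 1 and num_layers")
    # One key vector and one value vector per prefix position per layer.
    return 2 * prefix_length * layers * hidden_size


# Hypothetical sizes for illustration only.
full_depth = prefix_param_count(prefix_length=10, num_layers=32, hidden_size=4096)
half_depth = prefix_param_count(prefix_length=10, num_layers=32, hidden_size=4096, depth=16)
print(full_depth, half_depth)
```

Under these assumptions the count grows linearly with prefix length, with the number of covered layers, and with the hidden size, so a larger base model yields a larger prefix at the same length.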

Paper Structure (24 sections, 5 figures, 7 tables)

This paper contains 24 sections, 5 figures, 7 tables.

Introduction
Related work
Methodology
Research questions
Facts learning
Evaluation
Dataset
Triple extraction
Training set
Two evaluation tasks
Experimental setup
Evaluation
Results and analysis
Base setup
Error analysis
...and 9 more sections

Figures (5)

Figure 1: Proposed approach: new facts are extracted from Wikidata, transformed into sentences with Vicuna-13b, and trained into prefixes. We claim and show that this architecture is better than LoRA at capturing novel knowledge.
Figure 2: Percentage of prefix-tuned models obtaining increased accuracy over the baseline. Error bars span 95% confidence intervals.
Figure 3: Training loss in the basic setup, measured post-training.
Figure 4: Mean accuracy of prefix-tuned (PT) models, LoRA models, and the baseline (right) in the prediction setting. Error bars span 95% confidence intervals.
Figure 5: Frobenius norm of the key and value vectors of the prefix in the basic setup, measured post-training.
