Language Modeling with Editable External Knowledge

Belinda Z. Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig, Jacob Andreas

TL;DR

ERASE improves model behavior when new documents are acquired, by incrementally deleting or rewriting other entries in the knowledge base each time a document is added.

Abstract

When the world changes, so does the text that humans write about it. How do we build language models that can be easily updated to reflect these changes? One popular approach is retrieval-augmented generation, in which new documents are inserted into a knowledge base and retrieved during prediction for downstream tasks. Most prior work on these systems has focused on improving behavior during prediction through better retrieval or reasoning. This paper introduces ERASE, which instead improves model behavior when new documents are acquired, by incrementally deleting or rewriting other entries in the knowledge base each time a document is added. On two new benchmark datasets evaluating models' ability to answer questions about a stream of news articles or conversations, ERASE improves accuracy relative to conventional retrieval-augmented generation by 7-13% (Mixtral-8x7B) and 6-10% (Llama-3-8B) absolute. Code and data are available at https://github.com/belindal/ERASE

Paper Structure (55 sections, 4 equations, 7 figures, 9 tables)

Figures (7)

  • Figure 1: In standard retrieval-augmented generation (RAG), new facts are simply added to an existing knowledge base $\mathcal{K}$. This can leave stale facts in $\mathcal{K}$, which can in turn lead to incorrect predictions at inference time. In contrast, when ERASE reads a new input article, it not only adds new facts to $\mathcal{K}$, but also updates it. ERASE can edit or delete (not pictured) existing facts to keep $\mathcal{K}$ up to date, thereby enabling correct predictions at inference time. The same LM is used to update the memory and make predictions.
  • Figure 2: Overview of ERASE. We begin by retrieving existing facts relevant to the input and prompting an LM to update them. We also extract facts from the input to add to our knowledge base.
  • Figure 3: Sample data from our datasets. The News dataset consists of factual questions whose answers change over time, with the associated source inducing that change. The Conversations dataset consists of conversations between two personas with evolving life facts. The single-hop subset directly states all facts that are changed, while the multi-hop subset requires reasoning about previous chunks of conversation to infer all changes.
  • Figure 4: Mixtral-8x7B (top) and Llama-3-8B (bottom) results on the news article domain. ERASE outperforms RAG, RAG with fact-level granularity, and even long-context models, especially in later timesteps as more new information is learned.
  • Figure 5: Screenshot of round 1 of annotation for news article.
  • ...and 2 more figures
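
The update loop described in the Figure 1 and Figure 2 captions (retrieve relevant facts, let an LM keep, edit, or delete each one, then add facts extracted from the new document) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `KnowledgeBase`, `update`, `overlap`, `stub_judge`, and `stub_extract` are hypothetical names, the word-overlap score stands in for whatever retriever the real system uses, and the two stub functions stand in for LM prompts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical decision for one retrieved fact:
# ("keep", None), ("delete", None), or ("edit", "rewritten fact text").
Decision = Tuple[str, Optional[str]]


def overlap(a: str, b: str) -> float:
    """Toy relevance score: Jaccard overlap of lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


@dataclass
class KnowledgeBase:
    facts: List[str] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 3, min_score: float = 0.1) -> List[int]:
        """Return indices of up to k stored facts most similar to the query."""
        scored = [(overlap(query, f), i) for i, f in enumerate(self.facts)]
        relevant = [(s, i) for s, i in scored if s >= min_score]
        return [i for _, i in sorted(relevant, reverse=True)[:k]]


def update(
    kb: KnowledgeBase,
    document: str,
    judge: Callable[[str, str], Decision],
    extract: Callable[[str], List[str]],
) -> None:
    """Apply one new document to the knowledge base in place."""
    # Step 1: retrieve existing facts that the new document may affect.
    hits = kb.retrieve(document)

    # Step 2: ask the judge (an LM in the real system) what to do with each.
    to_delete = set()
    for i in hits:
        action, new_text = judge(kb.facts[i], document)
        if action == "delete":
            to_delete.add(i)
        elif action == "edit" and new_text:
            kb.facts[i] = new_text
    kb.facts = [f for i, f in enumerate(kb.facts) if i not in to_delete]

    # Step 3: add facts extracted from the new document, skipping duplicates.
    for fact in extract(document):
        if fact not in kb.facts:
            kb.facts.append(fact)


# Stubs standing in for LM calls (assumptions for illustration only).
def stub_judge(fact: str, document: str) -> Decision:
    if fact == "Alice is the CEO of Acme" and "replaced Alice" in document:
        return ("delete", None)
    return ("keep", None)


def stub_extract(document: str) -> List[str]:
    return ["Bob is the CEO of Acme"] if "Bob replaced Alice" in document else []


kb = KnowledgeBase(["Alice is the CEO of Acme", "Acme is based in Boston"])
update(kb, "Bob replaced Alice as the CEO of Acme", stub_judge, stub_extract)
print(kb.facts)
```

In plain RAG only step 3 would run, so the stale fact about Alice would stay in the knowledge base alongside the new one; the retrieve-and-judge steps are what remove or rewrite it.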