Adaptive Token Biaser: Knowledge Editing via Biasing Key Entities

Baolong Bi; Shenghua Liu; Yiwei Wang; Lingrui Mei; Hongcheng Gao; Yilong Xu; Xueqi Cheng

TL;DR

Experimental results show that ATBias significantly enhances ICE performance, achieving up to a 32.3% improvement over state-of-the-art ICE methods while incurring only half the latency.

Abstract

The parametric knowledge memorized by large language models (LLMs) becomes outdated quickly. In-context editing (ICE) is currently the most effective method for updating the knowledge of LLMs. Recent work enhances ICE by modifying the decoding strategy, which avoids altering internal model structures or adjusting external prompts. However, this enhancement operates across the entire generated sequence, which includes many non-critical tokens. In this work, we introduce Adaptive Token Biaser (ATBias), a new decoding technique designed to enhance ICE. It focuses on the tokens most closely related to knowledge during decoding, biasing their logits by matching key entities related to new and parametric knowledge. Experimental results show that ATBias significantly enhances ICE performance, achieving up to a 32.3% improvement over state-of-the-art ICE methods while incurring only half the latency. ATBias not only improves the knowledge editing capabilities of ICE but can also be widely applied to LLMs at negligible cost.
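
The idea of biasing only knowledge-related tokens can be illustrated with a minimal sketch. This is not the authors' implementation: the character-level n-grams, the top-k candidate filter, the scaling factor `alpha`, and all function names below are assumptions made for illustration, and a toy dictionary stands in for a model vocabulary. The sketch scores each high-ranked candidate token by its n-gram Jaccard similarity to new-knowledge and parametric-knowledge entities, then raises or lowers its logit accordingly.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of a lowercased string."""
    s = text.lower().strip()
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def entity_similarity(token, entities, n=2):
    """Highest n-gram Jaccard similarity between a token and any entity."""
    grams = char_ngrams(token, n)
    return max((jaccard(grams, char_ngrams(e, n)) for e in entities), default=0.0)


def bias_logits(logits, new_entities, param_entities, top_k=3, alpha=5.0, n=2):
    """Bias only the top-k candidates (hypothetical filter and scaling).

    logits: dict mapping token string -> raw logit (toy stand-in for a vocabulary).
    Tokens similar to new-knowledge entities are boosted; tokens similar to
    parametric-knowledge entities are penalized. Other tokens are left unchanged.
    """
    candidates = sorted(logits, key=logits.get, reverse=True)[:top_k]
    biased = dict(logits)
    for tok in candidates:
        gain = entity_similarity(tok, new_entities, n)
        penalty = entity_similarity(tok, param_entities, n)
        biased[tok] += alpha * (gain - penalty)
    return biased


# Toy example: the model's parametric answer is "London", the edited fact says "Paris".
toy_logits = {"London": 4.0, "Paris": 3.0, "the": 1.0, "Rome": 0.5}
toy_biased = bias_logits(toy_logits, ["Paris"], ["London"], top_k=2)
best_token = max(toy_biased, key=toy_biased.get)
```

In this toy run only the two highest-ranked candidates are touched, so the function word "the" keeps its original logit while the edited entity overtakes the parametric one.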

Paper Structure (34 sections, 10 equations, 7 figures, 10 tables, 1 algorithm)

Introduction
Preliminary
LLMs Decoding.
Multi-hop Editing.
Methods
Parametric Induction & Entity Extraction
Probabilistic-Ranking Filter
N-gram and Jaccard Similarity
Adaptive Token Biaser
Knowledge Caching for Efficient Editing
Experiments
Experimental Setup
Tasks.
Datasets.
Models and Baselines.
...and 19 more sections

Figures (7)

Figure 1: A simple example of in-context editing (ICE). ICE successfully edits easy knowledge but fails to edit stubborn knowledge.
Figure 2: Illustration of how ATBias enhances ICE during decoding. ATBias adjusts the key token probabilities based on the similarity computed between filtered tokens and the extracted new and parametric knowledge entities.
Figure 3: Probability (left) and ranking (right) statistics of new knowledge for LLaMA2-7B-chat on stubborn > 33%. The probabilities are derived from normalized calculations.
Figure 4: An illustration of ATBias's easy deployment on MeLLo.
Figure 5: Ablation study results for the n-gram size n used in the n-gram decomposition process.
...and 2 more figures
