Collaborative decoding of critical tokens for boosting factuality of large language models

Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu

TL;DR

The paper tackles factual hallucination in aligned large language models by introducing critical tokens: tokens whose exact realization strongly affects factual correctness. It proposes the Collaborative Decoding Strategy (CDS), which switches token by token between a pretrained model, used as the knowledge source, and an aligned instruction-following model, guided by a critical-token classifier. A critical-token dataset is created and used to train the classifier, so that Model CDS routes critical tokens to the pretrained model with greedy decoding and leaves noncritical tokens to the aligned model. Empirical results on TriviaQA, NaturalQuestions, and FActScore show substantial reductions in hallucination with minimal loss of response diversity, including strong gains when the two models differ in size or come from different model families, which suggests factuality can be improved without extensive in-domain tuning.

Abstract

The most common training pipeline for large language models includes pretraining, finetuning and aligning phases, with their respective resulting models, such as the pretrained model and the finetuned model. Finetuned and aligned models show improved abilities of instruction following and safe generation; however, their abilities to stay factual about the world are impacted by the finetuning process. Furthermore, the common practice of using sampling during generation also increases the chance of hallucination. In this work, we introduce a collaborative decoding framework to harness the high factuality within pretrained models through the concept of critical tokens. We first design a critical token classifier to decide which model to use for the next token, and subsequently generate the next token using different decoding strategies. Experiments with different models and datasets show that our decoding framework is able to reduce model hallucination significantly, showcasing the importance of the collaborative decoding framework.
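
The routing idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `collaborative_decode`, the toy models, and the rule used by `toy_is_critical` are all hypothetical stand-ins, and the choice to sample from the aligned model for noncritical tokens is an assumption (the summary only states that critical tokens go to the pretrained model with greedy decoding).

```python
import random
from typing import Callable, Dict, List, Optional

Dist = Dict[str, float]                      # token -> probability
NextTokenFn = Callable[[List[str]], Dist]    # context -> next-token distribution


def collaborative_decode(
    prompt: List[str],
    pretrained: NextTokenFn,
    aligned: NextTokenFn,
    is_critical: Callable[[List[str]], bool],
    max_new_tokens: int = 16,
    eos: str = "<eos>",
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Generate tokens, choosing a model per step via the classifier."""
    rng = rng or random.Random(0)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        if is_critical(tokens):
            # Critical position: take the pretrained model's most likely token.
            dist = pretrained(tokens)
            nxt = max(dist, key=dist.get)
        else:
            # Noncritical position: sample from the aligned model (assumption).
            dist = aligned(tokens)
            words, weights = zip(*dist.items())
            nxt = rng.choices(words, weights=weights, k=1)[0]
        if nxt == eos:
            break
        tokens.append(nxt)
    return tokens[len(prompt):]


# Toy stand-ins, purely illustrative.
def toy_pretrained(ctx: List[str]) -> Dist:
    return {"Paris": 0.9, "Lyon": 0.1}


def toy_aligned(ctx: List[str]) -> Dist:
    table = {"?": {"It": 1.0}, "It": {"is": 1.0}, "Paris": {"<eos>": 1.0}}
    return table.get(ctx[-1], {"<eos>": 1.0})


def toy_is_critical(ctx: List[str]) -> bool:
    # Hypothetical rule: the token after "is" carries the fact.
    return ctx[-1] == "is"


answer = collaborative_decode(
    ["capital", "of", "France", "?"], toy_pretrained, toy_aligned, toy_is_critical
)
print(answer)
```

In a real system the two callables would wrap language-model forward passes and the classifier would be a trained model rather than a hand-written rule. The sketch only shows the control flow of switching models and decoding strategies at each step.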

Paper Structure (22 sections, 4 equations, 5 figures, 6 tables, 1 algorithm)

Figures (5)

  • Figure 1: Performance of different LLaMA-2 70B models on TriviaQA. Aligned denotes the Chat models.
  • Figure 2: A prompt with a factuality-related question and some model responses. The critical tokens shaded in green in the response include proper names, numbers and facts about the entity, which have low variance tolerance.
  • Figure 3: Dataset generation for critical tokens.
  • Figure 4: The response diversity of sampling at different temperatures. The dashed lines show the evaluation values for Model CDS. Llama 2 7B models are used.
  • Figure 5: The effect of the number of shots given to the pretrained model on Natural Questions performance.