Fine-Refine: Iterative Fine-grained Refinement for Mitigating Dialogue Hallucination

Xiangyan Chen, Yujian Gan, Matthew Purver

TL;DR

Fine-Refine is a fine-grained refinement framework that decomposes responses into atomic units, verifies each unit using external knowledge, assesses fluency via perplexity, and iteratively corrects granular errors.

Abstract

The tendency for hallucination in current large language models (LLMs) negatively impacts dialogue systems. Such hallucinations produce factually incorrect responses that may mislead users and undermine system trust. Existing refinement methods for dialogue systems typically operate at the response level, overlooking the fact that a single response may contain multiple verifiable or unverifiable facts. To address this gap, we propose Fine-Refine, a fine-grained refinement framework that decomposes responses into atomic units, verifies each unit using external knowledge, assesses fluency via perplexity, and iteratively corrects granular errors. We evaluate factuality across the HybriDialogue and OpendialKG datasets in terms of factual accuracy (fact score) and coverage (Not Enough Information Proportion), and experiments show that Fine-Refine substantially improves factuality, achieving up to a 7.63-point gain in dialogue fact score, with a small trade-off in dialogue quality.
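
The decompose, verify, assess, and correct loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence-level decomposition, the dictionary standing in for external knowledge, the toy log-probability function, the rule-based corrector, and every function name below are assumptions made for illustration. The fact score is assumed here to be the share of supported units among the verifiable ones; the paper's exact definition may differ. Only the perplexity formula (the exponential of the negative mean token log-probability) is standard.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

# Verdict labels are illustrative names, not taken from the paper.
SUPPORTED, REFUTED, NEI = "supported", "refuted", "not_enough_info"


@dataclass
class Assessment:
    unit: str      # one atomic unit of the response
    verdict: str   # SUPPORTED, REFUTED, or NEI
    ppl: float     # perplexity of the unit (fluency signal)


def perplexity(token_logprobs: List[float]) -> float:
    """Standard perplexity: exp of the negative mean natural-log probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def decompose(response: str) -> List[str]:
    """Toy decomposition (assumption): each sentence is one atomic unit."""
    return [s.strip() for s in response.split(".") if s.strip()]


def assess(response: str, knowledge: Dict[str, bool],
           logprob_fn: Callable[[str], List[float]]) -> List[Assessment]:
    """Verify each unit against the knowledge dict and score its fluency."""
    results = []
    for unit in decompose(response):
        if unit not in knowledge:
            verdict = NEI
        else:
            verdict = SUPPORTED if knowledge[unit] else REFUTED
        results.append(Assessment(unit, verdict, perplexity(logprob_fn(unit))))
    return results


def fact_score(assessments: List[Assessment]) -> float:
    """Assumed definition: supported units divided by verifiable units."""
    verifiable = [a for a in assessments if a.verdict != NEI]
    if not verifiable:
        return 0.0
    return sum(a.verdict == SUPPORTED for a in verifiable) / len(verifiable)


def nei_proportion(assessments: List[Assessment]) -> float:
    """Share of units for which the knowledge source has no evidence."""
    if not assessments:
        return 0.0
    return sum(a.verdict == NEI for a in assessments) / len(assessments)


def refine(response: str, knowledge: Dict[str, bool],
           logprob_fn: Callable[[str], List[float]],
           corrector: Callable[[List[Assessment]], str],
           max_iters: int = 3) -> str:
    """Re-assess and correct until no unit is refuted or the budget runs out."""
    for _ in range(max_iters):
        assessments = assess(response, knowledge, logprob_fn)
        if all(a.verdict != REFUTED for a in assessments):
            break
        response = corrector(assessments)
    return response


# Hypothetical toy data standing in for a knowledge source and an LLM.
KNOWLEDGE = {
    "Paris is the capital of France": True,
    "The Eiffel Tower is in Berlin": False,
    "The Eiffel Tower is in Paris": True,
}
FIXES = {"The Eiffel Tower is in Berlin": "The Eiffel Tower is in Paris"}


def toy_logprobs(unit: str) -> List[float]:
    """Placeholder for a language model: one log-prob of -1.0 per word."""
    return [-1.0] * len(unit.split())


def toy_corrector(assessments: List[Assessment]) -> str:
    """Placeholder for the LLM rewrite: swap refuted units for known fixes."""
    units = [FIXES.get(a.unit, a.unit) if a.verdict == REFUTED else a.unit
             for a in assessments]
    return ". ".join(units) + "."


draft = ("Paris is the capital of France. The Eiffel Tower is in Berlin. "
         "It is lovely in spring.")
print(refine(draft, KNOWLEDGE, toy_logprobs, toy_corrector))
```

In a real system the dictionary lookup would be replaced by retrieval plus a verification model, and the corrector by an LLM prompted with the per-unit feedback. The sketch only shows how unit-level verdicts, rather than a single response-level judgment, drive each refinement pass.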

Paper Structure

This paper contains 20 sections, 8 equations, 1 figure, and 5 tables.

Figures (1)

  • Figure 1: Overall framework of the proposed fine-grained refinement. Starting from the dialogue, the original LLM-generated response contains several factually incorrect atomic units. The right part describes the assessment details. After assessing these atomic units, the LLM refines the response accordingly.