PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models

Kunquan Deng, Zeyu Huang, Chen Li, Chenghua Lin, Min Gao, Wenge Rong

TL;DR

This work presents PFME, a modular framework for fine-grained hallucination detection and editing in large language models, comprising a Real-time Fact Retrieval module and a Progressive Fine-grained Detection and Editing module. By standardizing a six-type taxonomy of fine-grained hallucinations and leveraging external evidence, PFME achieves higher detection accuracy (OA and Bi) and factual editing quality (FActScore) than strong baselines such as FavaP, especially when paired with Llama3-8B-Instruct. Ablations on evidence quantity, retrieval level, and ranking methods yield practical guidelines for evidence-efficient editing, and results on FavaBench and FActScore indicate that the framework is robust across datasets. The approach is relevant to deploying more faithful and reliable LLMs in real-world applications, with room for refining prompts and expanding evaluation benchmarks.

Abstract

Large Language Models (LLMs) excel in fluency but risk producing inaccurate content, called "hallucinations." This paper outlines a standardized process for categorizing fine-grained hallucination types and proposes a framework--the Progressive Fine-grained Model Editor (PFME)--specifically designed to detect and correct fine-grained hallucinations in LLMs. PFME consists of two collaborative modules: the Real-time Fact Retrieval Module and the Fine-grained Hallucination Detection and Editing Module. The former identifies key entities in the document and retrieves the latest factual evidence from credible sources. The latter segments the document into sentences and, based on relevant evidence and previously edited context, identifies the hallucination type in each sentence, locates it, and edits it. Experimental results on FavaBench and FActScore demonstrate that PFME outperforms existing methods in fine-grained hallucination detection tasks. In particular, when using the Llama3-8B-Instruct model, PFME's performance in fine-grained hallucination detection with external knowledge assistance is 8.7 percentage points (pp) higher than ChatGPT's. In editing tasks, PFME raises the FActScore on the FActScore-Alpaca13B and FActScore-ChatGPT datasets by 16.2pp and 4.6pp, respectively.
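
The sentence-by-sentence loop described in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the names `split_sentences`, `progressive_edit`, and `toy_detector` are hypothetical, the evidence is assumed to have been retrieved already, the labels are placeholders rather than the paper's six-type taxonomy, and the detector is a trivial stand-in for the LLM call.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical detector signature (an assumption, not from the paper):
# (sentence, evidence, previously_edited_sentences) -> (label, edited_sentence)
Detector = Callable[[str, List[str], List[str]], Tuple[str, str]]


def split_sentences(document: str) -> List[str]:
    """Naive splitter on ., !, ? followed by whitespace (a placeholder segmenter)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def progressive_edit(
    document: str, evidence: List[str], detect_and_edit: Detector
) -> Tuple[str, List[str]]:
    """Process sentences in order; each call sees the sentences already edited."""
    edited: List[str] = []
    labels: List[str] = []
    for sentence in split_sentences(document):
        # Pass a copy of the edited context so the detector cannot mutate it.
        label, fixed = detect_and_edit(sentence, evidence, list(edited))
        labels.append(label)
        edited.append(fixed)
    return " ".join(edited), labels


def toy_detector(
    sentence: str, evidence: List[str], context: List[str]
) -> Tuple[str, str]:
    """Stand-in for the LLM: corrects one hard-coded year if the evidence disagrees."""
    if "1900" in sentence and any("1879" in e for e in evidence):
        return "hallucinated", sentence.replace("1900", "1879")
    return "none", sentence


if __name__ == "__main__":
    doc = "Einstein was born in 1900. He was a physicist."
    facts = ["Albert Einstein was born in 1879."]
    print(progressive_edit(doc, facts, toy_detector))
```

Passing the already-edited sentences into each call is what makes the loop progressive: a later sentence is judged against corrected context rather than the original, possibly hallucinated, text.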

Paper Structure

This paper contains 27 sections, 3 figures, 9 tables.

Figures (3)

  • Figure 1: Progressive Fine-grained Model Editor Architecture
  • Figure 2: Progressive Fine-grained Model Editor (PFME)'s Detection and Editing Module
  • Figure 3: Ablation: Similarity