KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

Jiawei Zhang, Chejian Xu, Yu Gai, Freddy Lecue, Dawn Song, Bo Li

TL;DR

KnowHalu tackles LLM hallucinations with a two-phase approach: first detecting non-fabrication hallucinations, then performing multi-form factual checking that decomposes queries and leverages both unstructured and structured knowledge. The method integrates stepwise reasoning, targeted knowledge retrieval, knowledge optimization, and a fusion-based aggregation to produce robust judgments. Across QA and summarization benchmarks on HaluEval, KnowHalu achieves substantial gains over state-of-the-art baselines, and analyses reveal the value of query formulation, retrieval depth, and multi-form knowledge. The framework demonstrates versatility and provides insights into how external knowledge and structured reasoning can dramatically improve factual reliability in LLM outputs, with potential extensions to dialogue and longer responses.

Abstract

This paper introduces KnowHalu, a novel approach for detecting hallucinations in text generated by large language models (LLMs), utilizing step-wise reasoning, multi-formulation queries, multi-form knowledge for factual checking, and a fusion-based detection mechanism. As LLMs are increasingly applied across various domains, ensuring that their outputs are not hallucinated is critical. Recognizing the limitations of existing approaches that either rely on the self-consistency check of LLMs or perform post-hoc fact-checking without considering the complexity of queries or the form of knowledge, KnowHalu proposes a two-phase process for hallucination detection. In the first phase, it identifies non-fabrication hallucinations, i.e., responses that, while factually correct, are irrelevant or non-specific to the query. The second phase, multi-form based factual checking, contains five key steps: reasoning and query decomposition, knowledge retrieval, knowledge optimization, judgment generation, and judgment aggregation. Our extensive evaluations demonstrate that KnowHalu significantly outperforms SOTA baselines in detecting hallucinations across diverse tasks, e.g., improving by 15.65% in QA tasks and 5.50% in summarization tasks, highlighting its effectiveness and versatility in detecting hallucinations in LLM-generated content.

Paper Structure (28 sections, 1 figure, 19 tables, 1 algorithm)

Figures (1)

  • Figure 1: Overview of KnowHalu. The hallucination detection process starts with "Non-Fabrication Hallucination Checking", a phase focusing on the early identification of non-fabrication hallucinations by scrutinizing the specificity of the answers. For potential fabrication hallucinations, KnowHalu then provides a comprehensive "Factual Checking", which consists of five steps: (a) "Step-wise Reasoning and Query" breaks down the original query into step-wise reasoning and sub-queries for detailed factual checking; (b) "Knowledge Retrieval" retrieves unstructured knowledge via RAG and structured knowledge in the form of triplets for each sub-query; (c) "Knowledge Optimization" leverages LLMs to summarize and refine the retrieved knowledge into different forms; (d) "Judgment Based on Multi-form Knowledge" employs LLMs to critically assess the answer to sub-queries, based on each form of knowledge; (e) "Aggregation" provides a further refined judgment by aggregating predictions based on different forms of knowledge.
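
The control flow described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: every name here (`Judgment`, `merge_steps`, `aggregate`, `detect`, the label strings, the confidence score, and the 0.7 threshold) is hypothetical, the LLM and retrieval components are passed in as plain callables, and the merging and aggregation rules are simple stand-ins, since this overview does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

FORMS = ("unstructured", "structured")  # e.g. retrieved passages vs. triplets


@dataclass
class Judgment:
    label: str         # "CORRECT", "INCORRECT", or "INCONCLUSIVE" (assumed labels)
    confidence: float  # hypothetical score in [0, 1]


def merge_steps(steps: Sequence[Judgment]) -> Judgment:
    """Collapse sub-query judgments into one judgment for a knowledge form.

    Assumed rule: any INCORRECT step wins, then any INCONCLUSIVE step.
    """
    for label in ("INCORRECT", "INCONCLUSIVE", "CORRECT"):
        matching = [j.confidence for j in steps if j.label == label]
        if matching:
            return Judgment(label, min(matching))
    raise ValueError("no usable judgments to merge")


def aggregate(by_form: Dict[str, Judgment], threshold: float = 0.7) -> str:
    """Step (e), as a toy rule: defer to the other knowledge form only when
    the base judgment is weak and the other one is confident."""
    base, other = by_form["unstructured"], by_form["structured"]
    base_weak = base.label == "INCONCLUSIVE" or base.confidence < threshold
    other_strong = other.label != "INCONCLUSIVE" and other.confidence >= threshold
    return other.label if base_weak and other_strong else base.label


def detect(
    query: str,
    answer: str,
    *,
    is_non_fabrication: Callable[[str, str], bool],
    decompose: Callable[[str, str], List[str]],
    retrieve: Callable[[str, str], str],
    optimize: Callable[[str, str, str], str],
    judge: Callable[[str, str, str], Judgment],
) -> str:
    # Phase 1: flag answers that are irrelevant or non-specific to the query.
    if is_non_fabrication(query, answer):
        return "NON_FABRICATION_HALLUCINATION"

    # Phase 2: factual checking, run once per knowledge form.
    sub_queries = decompose(query, answer)               # step (a)
    by_form: Dict[str, Judgment] = {}
    for form in FORMS:
        steps = []
        for sub_query in sub_queries:
            raw = retrieve(sub_query, form)              # step (b)
            knowledge = optimize(sub_query, raw, form)   # step (c)
            steps.append(judge(sub_query, answer, knowledge))  # step (d)
        by_form[form] = merge_steps(steps)
    return aggregate(by_form)                            # step (e)
```

Passing the components in as callables keeps the sketch independent of any particular LLM or retriever; in practice each callable would wrap a prompt or a retrieval call, and the aggregation rule would be replaced by whatever the method actually uses.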