Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts

Raavi Gupta, Pranav Hari Panicker, Sumit Bhatia, Ganesh Ramakrishnan

TL;DR

The paper tackles hallucinations in LLM-generated text under API-restricted settings by introducing ConFactCheck, a training-free detector that relies on the model's internal world knowledge. It employs a two-stage pipeline: (1) a Fact Alignment Check, which extracts key facts via POS/NER tagging, generates targeted questions with a fine-tuned T5, and uses an LLM-as-a-judge to compare original and regenerated facts; and (2) a Uniform Distribution Check, which applies a Kolmogorov–Smirnov test to the top-5 token distributions of regenerated facts to assess confidence. Empirically, ConFactCheck achieves strong or competitive AUC-PR performance across open-domain QA datasets and WikiBio, while reducing LLM calls and latency relative to self-check baselines; ablations show that the uniformity check and decoding strategies meaningfully boost accuracy. The approach offers training-free deployability, interpretable explanations at the key-fact level, and practical impact for deploying hallucination detection in constrained environments where external knowledge and model fine-tuning are not feasible.
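The idea behind the Uniform Distribution Check can be illustrated with a small sketch. It is not the authors' implementation: it assumes the top-5 token probabilities are already available, renormalizes them, and measures the Kolmogorov–Smirnov distance between their cumulative distribution and a discrete uniform distribution over the five ranks. The function names and the `threshold` value are hypothetical, and a fixed cutoff is used here in place of a p-value.

```python
from itertools import accumulate


def ks_distance_from_uniform(top_probs):
    """KS statistic between renormalized top-k probabilities and a
    discrete uniform distribution over the same k ranks."""
    total = sum(top_probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Sort by rank (most likely token first) and renormalize to sum to 1.
    normalized = [p / total for p in sorted(top_probs, reverse=True)]
    k = len(normalized)
    empirical_cdf = list(accumulate(normalized))
    # The uniform CDF at rank i (1-based) is i / k.
    return max(abs(c - (i + 1) / k) for i, c in enumerate(empirical_cdf))


def looks_uniform(top_probs, threshold=0.2):
    """Hypothetical decision rule: a small KS distance means the model
    spreads probability almost evenly, i.e. it is not confident."""
    return ks_distance_from_uniform(top_probs) < threshold


confident = [0.96, 0.01, 0.01, 0.01, 0.01]  # one dominant token
uncertain = [0.2, 0.2, 0.2, 0.2, 0.2]       # evenly spread mass
print(looks_uniform(confident), looks_uniform(uncertain))
```

In this sketch a near-uniform top-5 distribution is read as low confidence in the regenerated fact, so such a fact would be a candidate for being flagged even if it passed the alignment check.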

Abstract

Large language models (LLMs), despite their remarkable text generation capabilities, often hallucinate and generate text that is factually incorrect and not grounded in real-world knowledge. This poses serious risks in domains like healthcare, finance, and customer support. A typical way to use LLMs is via the APIs provided by LLM vendors, where there is no access to model weights or options to fine-tune the model. Existing methods to detect hallucinations in such settings, where model access is restricted or constrained by resources, typically require making multiple LLM API calls, increasing latency and API cost. We introduce CONFACTCHECK, an efficient hallucination detection approach that does not leverage any external knowledge base and works on the simple intuition that responses to factual probes within the generated text should be consistent within a single LLM and across different LLMs. Rigorous empirical evaluation on multiple datasets that cover both factual text generation and open-ended generation shows that CONFACTCHECK can detect hallucinated facts efficiently using fewer resources and achieves higher accuracy scores compared to existing baselines that operate under similar conditions. Our code is available here.

Paper Structure

This paper contains 36 sections, 4 figures, 11 tables, 2 algorithms.

Figures (4)

  • Figure 1: Key fact-based hallucination detection through the Fact Alignment check of our ConFactCheck pipeline. Each fact is used to generate a question, and the fact is regenerated by prompting the question to the LLM. The regenerated facts are compared with the original extracted key facts to check for their consistency.
  • Figure 2: Pipeline of the ConFactCheck approach. NER tagging of outputs is followed by the first comparison-based check (Fact Alignment Check) and then by the secondary KS test-based probability check (Uniform Distribution Check), which rechecks the outputs classified as non-hallucinations and produces the final tagging of hallucinations.
  • Figure 3: Prompting templates used for Fact Regeneration and Fact Alignment in the ConFactCheck pipeline. Note that the alignment prompt uses few-shot prompting.
  • Figure 4: Hypothetical step-by-step example explaining the methodology of ConFactCheck
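
The Fact Alignment flow described in the captions for Figures 1 and 2 can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the paper's code: the fact extractor, question generator, LLM call, and judge are passed in as plain callables, all function names are hypothetical, and the rule "any misaligned fact marks the text as hallucinated" is an assumption made for the example. The toy stand-ins at the bottom replace the NER tagger, T5 question generator, and LLM calls with hard-coded behavior.

```python
from typing import Callable, Iterable, List, Tuple


def fact_alignment_check(
    text: str,
    extract_facts: Callable[[str], Iterable[str]],
    make_question: Callable[[str, str], str],
    regenerate: Callable[[str], str],
    judge_aligned: Callable[[str, str, str], bool],
) -> List[Tuple[str, str, bool]]:
    """For each key fact: build a question, regenerate the fact by asking
    the model, and record whether the judge considers the two consistent."""
    results = []
    for fact in extract_facts(text):
        question = make_question(text, fact)
        regenerated = regenerate(question)
        results.append((fact, regenerated, judge_aligned(question, fact, regenerated)))
    return results


def is_hallucinated(results: List[Tuple[str, str, bool]]) -> bool:
    """Assumed aggregation rule: one inconsistent key fact is enough."""
    return any(not aligned for _, _, aligned in results)


# Toy stand-ins (hypothetical) so the sketch runs without any model.
text = "Marie Curie was born in Paris."
canned_answers = {
    "___ was born in Paris.": "Marie Curie",
    "Marie Curie was born in ___.": "Warsaw",
}
results = fact_alignment_check(
    text,
    extract_facts=lambda t: ["Marie Curie", "Paris"],
    make_question=lambda t, fact: t.replace(fact, "___"),
    regenerate=lambda q: canned_answers[q],
    judge_aligned=lambda q, a, b: a.lower() == b.lower(),
)
print(results, is_hallucinated(results))
```

Keeping the per-fact results, rather than returning only a single label, mirrors the key-fact-level explanations mentioned in the TL;DR: each entry shows which fact disagreed with its regenerated counterpart.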