Efficient Hallucination Detection: Adaptive Bayesian Estimation of Semantic Entropy with Guided Semantic Exploration

Qiyao Sun; Xingming Li; Xixiang He; Ao Cheng; Xuanyu Ji; Hailun Lu; Runke Huang; Qingyong Hu

Abstract

Large language models (LLMs) have achieved remarkable success in various natural language processing tasks, yet they remain prone to generating factually incorrect outputs known as hallucinations. Recent approaches to hallucination detection have shown promise by repeatedly sampling from the LLM and quantifying the semantic inconsistency among the generated responses. However, they rely on fixed sampling budgets that do not adapt to query complexity, which makes them computationally inefficient. We propose an Adaptive Bayesian Estimation framework for Semantic Entropy with Guided Semantic Exploration, which dynamically adjusts sampling requirements based on observed uncertainty. Our approach employs a hierarchical Bayesian framework to model the semantic distribution, enabling dynamic control of sampling iterations through variance-based thresholds that terminate generation once sufficient certainty is achieved. We also develop a perturbation-based importance sampling strategy to systematically explore the semantic space. Extensive experiments on four QA datasets demonstrate that our method achieves superior hallucination detection performance with significant efficiency gains. In low-budget scenarios, our approach requires about 50% fewer samples to achieve detection performance comparable to existing methods, and it delivers an average AUROC improvement of 12.6% under the same sampling budget.
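
The variance-based stopping idea described in the abstract can be illustrated with a small sketch. This is not the paper's hierarchical model, and it omits the perturbation-based importance sampling entirely. It assumes a flat Dirichlet prior over a fixed number of semantic clusters and estimates the posterior mean and variance of the entropy by Monte Carlo. All names and defaults (`k_max`, `prior`, `var_threshold`, `draw_cluster`) are hypothetical choices made for illustration.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def entropy_posterior(counts, prior=0.5, n_draws=2000, seed=0):
    """Monte Carlo mean and variance of the entropy under a Dirichlet posterior.

    Assumption: a symmetric Dirichlet(prior) over a fixed set of clusters,
    standing in for the paper's hierarchical Bayesian model.
    """
    rng = random.Random(seed)
    alpha = [c + prior for c in counts]
    hs = [entropy(sample_dirichlet(alpha, rng)) for _ in range(n_draws)]
    mean = sum(hs) / n_draws
    var = sum((h - mean) ** 2 for h in hs) / (n_draws - 1)
    return mean, var


def adaptive_estimate(draw_cluster, k_max=4, var_threshold=0.02,
                      min_samples=2, max_samples=20):
    """Sample responses until the posterior variance of the entropy is small.

    draw_cluster is a hypothetical callable that generates one LLM response
    and returns the index (0..k_max-1) of its semantic cluster.
    """
    counts = [0] * k_max
    mean = var = 0.0
    n = 0
    for n in range(1, max_samples + 1):
        counts[draw_cluster()] += 1
        mean, var = entropy_posterior(counts)
        # Stop once the entropy estimate is certain enough.
        if n >= min_samples and var < var_threshold:
            break
    return mean, var, n


if __name__ == "__main__":
    toy_rng = random.Random(1)
    # Toy stand-in for an LLM plus semantic clustering: uniform over 3 clusters.
    print(adaptive_estimate(lambda: toy_rng.randrange(3)))
```

The returned triple is the estimated semantic entropy, its posterior variance, and the number of samples actually drawn. In this sketch the sample count varies per query because termination depends on the observed cluster counts rather than on a fixed budget.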

Paper Structure (36 sections, 26 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Hallucination
Hallucination Detection
External knowledge based methods
Metacognition based methods
Single-sample based methods
Multi-sample based methods
Problem Formulation
Language Model Generation
Semantic Equivalence
Semantic Entropy
Estimation Problem
Method
Hierarchical Bayesian Framework
...and 21 more sections

Figures (3)

Figure 1: Comparison of fixed sampling (a) versus our adaptive Bayesian approach (b) for hallucination detection. Fixed sampling wastes computational resources on simple queries (LLM1) while failing to discover semantic diversity in complex cases (LLM2). Our method dynamically adjusts sampling based on variance thresholds, enabling efficient and accurate hallucination detection.
Figure 2: AUROC performance comparison of hallucination detection methods on Llama-3.1-8B across varying sampling budgets N.
Figure 3: Distribution of actual sampling counts under a fixed average budget of N=5 across four QA datasets using Llama-3.1-8B.
