Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection

Kedi Chen, Qin Chen, Jie Zhou, Xinqi Tao, Bowen Ding, Jingwen Xie, Mingchen Xie, Peilong Li, Feng Zheng, Liang He

TL;DR

This paper tackles hallucination in large language models by moving beyond token-level uncertainty to a semantic-graph framework that captures entity relations and cross-sentence dependencies. It builds a passage-level semantic graph using AMR parsing and entity linking, then introduces relation-based uncertainty propagation and graph-based uncertainty calibration to improve sentence- and passage-level detection. The approach yields state-of-the-art results on WikiBio and the Chinese NoteSum dataset, including a 19.78% improvement in passage-level detection, and demonstrates strong cross-domain generalization. The work advances practical hallucination detection by integrating structured semantic information with uncertainty estimation, with implications for safer, more reliable LLM deployments and potential integration with external knowledge sources for fact-checking.

Abstract

Large Language Models (LLMs) are prone to hallucination, producing non-factual or unfaithful statements, which undermines their application in real-world scenarios. Recent research focuses on uncertainty-based hallucination detection, which uses the output probabilities of LLMs to calculate uncertainty and does not rely on external knowledge or frequent sampling from LLMs. However, most approaches consider only the uncertainty of each independent token, while the intricate semantic relations among tokens and sentences are not well studied, which limits the detection of hallucinations that span multiple tokens and sentences in a passage. In this paper, we propose a method to enhance uncertainty modeling with a semantic graph for hallucination detection. Specifically, we first construct a semantic graph that captures the relations among entity tokens and sentences. Then, we incorporate the relations between two entities for uncertainty propagation to enhance sentence-level hallucination detection. Given that hallucination occurs due to conflicts between sentences, we further present a graph-based uncertainty calibration method that integrates the contradiction probability of a sentence with its neighbors in the semantic graph for uncertainty calculation. Extensive experiments on two datasets show the advantages of our proposed approach. In particular, we obtain a substantial improvement of 19.78% in passage-level hallucination detection.

Paper Structure (46 sections, 8 equations, 5 figures, 9 tables)

Figures (5)

  • Figure 1: (a) Previous work considers only independent tokens and uses their average scores as the metric, resulting in errors in sentence- and passage-level detection. (b) Our method captures more complex semantic dependencies with a semantic graph for uncertainty modeling, such as the relations between entities and the relations with neighboring sentences in the passage-level semantic graph.
  • Figure 2: Overview of our approach for hallucination detection. For token-level uncertainty, we integrate the maximum and variance of the probabilities, along with a sequence decay term. For sentence-level uncertainty, we interpolate between the sum of entity uncertainty obtained through relation-based propagation and the global uncertainty obtained via a quantile. Finally, for passage-level uncertainty, we incorporate the relations of neighboring sentences in the semantic graph through graph-based uncertainty calibration.
  • Figure 3: The uncertainty scores of three types of samples calculated with FOCUS and with our method.
  • Figure 4: Visualization of the entity uncertainty and global uncertainty for three types of samples.
  • Figure 5: The Pearson and Spearman correlations of our method and the compared methods for passage-level uncertainty calculation.
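
The three levels of uncertainty named in the Figure 2 caption can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the paper's formulas, which are not reproduced on this page. The function names, the way the maximum and variance are combined, the direction of the decay, and the parameters `decay`, `gamma`, `lam`, `q`, and `eta` are all hypothetical choices made only to show how the pieces could fit together.

```python
import math
from statistics import mean, pvariance


def token_uncertainty(top_probs, position, decay=0.05):
    """Token level (assumed form): combine the maximum and the variance of
    the top-k probabilities, then apply a position-based decay."""
    peak = -math.log(max(top_probs))        # low max probability -> high score
    flatness = 1.0 - pvariance(top_probs)   # flat distribution -> factor near 1
    return peak * flatness * math.exp(-decay * position)


def sentence_uncertainty(token_scores, entity_relations, gamma=0.5, lam=0.5, q=0.9):
    """Sentence level (assumed form): interpolate between the summed entity
    uncertainty after relation-based propagation and a quantile of all
    token scores (the "global" term)."""
    entity = {}
    for head, tail in entity_relations:
        entity.setdefault(head, token_scores[head])
        entity.setdefault(tail, token_scores[tail])
    # Propagate a share of the head entity's uncertainty to the tail entity.
    for head, tail in entity_relations:
        entity[tail] += gamma * token_scores[head]
    entity_term = sum(entity.values())
    ranked = sorted(token_scores)
    global_term = ranked[min(len(ranked) - 1, int(q * len(ranked)))]
    return lam * entity_term + (1.0 - lam) * global_term


def passage_uncertainty(sentence_scores, neighbors, contradiction, eta=1.0):
    """Passage level (assumed form): raise each sentence score by the mean
    contradiction probability with its graph neighbors, then average."""
    calibrated = []
    for i, score in enumerate(sentence_scores):
        probs = [contradiction[(i, j)] for j in neighbors.get(i, [])]
        calibrated.append(score + eta * (mean(probs) if probs else 0.0))
    return mean(calibrated)
```

In this sketch `entity_relations` is a list of `(head_index, tail_index)` pairs over token positions, `neighbors` maps a sentence index to its adjacent sentences in the semantic graph, and `contradiction` maps a sentence pair to a contradiction probability, for example from an NLI model. All three are placeholder input formats.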