ReXTrust: A Model for Fine-Grained Hallucination Detection in AI-Generated Radiology Reports

Romain Hardy, Sung Eun Kim, Du Hyun Ro, Pranav Rajpurkar

TL;DR

ReXTrust addresses hallucinations in AI-generated radiology reports with a white-box detector: a self-attention module analyzes sequences of hidden states from a large vision-language model (LVLM) and yields finding-level hallucination risk scores. Trained on MIMIC-CXR with GPT-4o entailment-based labels and evaluated via 5-fold cross-validation, it achieves AUROC values of $0.8751$ overall and $0.8963$ on clinically significant findings. The method outperforms black-box and other white-box baselines, with additional gains from self-attention and from a complementary ensemble with RadFlag. The work highlights the value of hidden-state signals for improving safety and reliability in medical AI report generation, and discusses generalizability to other architectures, limitations of supervision, and avenues for future improvement such as better visual grounding.
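To make the idea of scoring a finding from its hidden states concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not specify the architecture, so the single attention head, the mean pooling, the logistic output head, and every name (`finding_risk_score`, `Wq`, `Wk`, `Wv`, `w_out`) are assumptions chosen for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def finding_risk_score(hidden_states, Wq, Wk, Wv, w_out, b_out=0.0):
    """Map the token hidden states of one finding to a risk score in (0, 1).

    Assumed design (hypothetical, not taken from the paper):
    single-head scaled dot-product self-attention, mean pooling over
    tokens, then a logistic (sigmoid) classification head.
    """
    Q = [matvec(Wq, h) for h in hidden_states]
    K = [matvec(Wk, h) for h in hidden_states]
    V = [matvec(Wv, h) for h in hidden_states]
    scale = math.sqrt(len(Q[0]))

    attended = []
    for q in Q:
        # Attention weights of this token over every token in the finding.
        weights = softmax([dot(q, k) / scale for k in K])
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )

    # Mean-pool the attended token vectors into one finding-level vector.
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    logit = dot(w_out, pooled) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

# Toy usage: 4 tokens with 3-dimensional hidden states, projected to 2 dimensions.
states = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.4, 0.4, 0.0], [-0.3, 0.2, 0.1]]
W = [[0.2, -0.1, 0.0], [0.1, 0.3, -0.2]]
score = finding_risk_score(states, W, W, W, w_out=[0.7, -0.4])
```

In a trained detector the projection matrices and the output head would be learned from labeled findings; here they are fixed toy values so the data flow is easy to follow.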

Abstract

The increasing adoption of AI-generated radiology reports necessitates robust methods for detecting hallucinations--false or unfounded statements that could impact patient care. We present ReXTrust, a novel framework for fine-grained hallucination detection in AI-generated radiology reports. Our approach leverages sequences of hidden states from large vision-language models to produce finding-level hallucination risk scores. We evaluate ReXTrust on a subset of the MIMIC-CXR dataset and demonstrate superior performance compared to existing approaches, achieving an AUROC of 0.8751 across all findings and 0.8963 on clinically significant findings. Our results show that white-box approaches leveraging model hidden states can provide reliable hallucination detection for medical AI systems, potentially improving the safety and reliability of automated radiology reporting.
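For readers unfamiliar with the AUROC figures quoted above, the metric can be read as the probability that a randomly chosen hallucinated finding receives a higher risk score than a randomly chosen non-hallucinated one, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy labels and scores are illustrative assumptions, not data from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a hallucinated finding, 0 otherwise.
    scores: predicted hallucination risk, higher means riskier.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.4, 0.6]
toy_auroc = auroc(toy_labels, toy_scores)  # 0.75
```

The pairwise form is quadratic in the number of findings, which is fine for illustration; rank-based implementations give the same value more efficiently on large evaluation sets.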

Paper Structure

This paper contains 27 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: ReXTrust hallucination detection framework. The model processes MedVersa's hidden states $h_i$ through a self-attention module to produce finding-level hallucination scores.
  • Figure 2: ReXTrust AUGRC performance on five medical categories, compared to RadFlag.
  • Figure 3: Qualitative examples of ReXTrust on four studies from our held-out evaluation set. Shown on the left are examples of chest X-rays serving as inputs to MedVersa. Shown in the middle are findings generated by MedVersa, as well as token-level attention visualizations from ReXTrust. Shown on the right are the finding-level risk scores output by the classification head of ReXTrust.
  • Figure 4: Prompt template for radiological finding severity classification. The prompt provides detailed definitions and examples for each severity category to ensure consistent classification across all experiments.
  • Figure 5: AUROC and AUPRC performance of ReXTrust as a function of the layer index of the MedVersa hidden states used as inputs. Model performance saturates near layer 16. Error bars indicate 95% confidence intervals.