Anatomically-Grounded Fact Checking of Automated Chest X-ray Reports
R. Mahmood, K. C. L. Wong, D. M. Reyes, N. D'Souza, L. Shi, J. Wu, P. Kaviani, M. Kalra, G. Wang, P. Yan, T. Syeda-Mahmood
TL;DR
The paper tackles hallucinations in automated chest X-ray report generation by proposing an anatomically-grounded fact-checking framework. It builds a synthetic image-FFL dataset, pairing images with real and fake descriptions of findings and their locations, and trains a multi-label cross-modal contrastive regression network to distinguish real from fake findings and localize them to anatomical regions, enabling explainable error detection and LLM-assisted correction. Across multiple datasets and report generators, the method performs well on both real/fake classification and grounding, and yields over 40% improvement in corrected report quality. This approach offers a practical path toward safer, more reliable radiology report generation in clinical workflows, with potential applicability to broader medico-visual tasks.
Abstract
With the emergence of large-scale vision-language models, realistic radiology reports can be generated from medical images alone, guided by simple prompts. However, their practical utility has been limited by factual errors in their descriptions of findings. In this paper, we propose a novel model for explainable fact-checking that identifies errors in findings and in their locations as indicated in the reports. Specifically, we analyze the types of errors made by automated reporting methods and, from a ground truth dataset, derive a new synthetic dataset of images paired with real and fake descriptions of findings and their locations. A new multi-label cross-modal contrastive regression network is then trained on this dataset. We evaluate the resulting fact-checking model and its utility in correcting reports generated by several state-of-the-art automated reporting tools on a variety of benchmark datasets, with results pointing to over 40% improvement in report quality through such error detection and correction.
