Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs
Xiaoman Zhang, Julián N. Acosta, Hong-Yu Zhou, Pranav Rajpurkar
TL;DR
This work introduces ReXKG, a knowledge-graph framework that converts radiology reports into structured graphs so that AI-generated reports can be evaluated beyond surface-level text. It defines six entity types and three relations, builds nodes and edges through UMLS mapping and semantic merging, and introduces three metrics: ReXKG-NSC for node similarity, ReXKG-AMS for edge distribution, and ReXKG-SCS for subgraph coverage. In experiments on CheXpert Plus and MIMIC-CXR (with cross-modality demonstrations on CT-RATE and MIMIC-IV Head CT), the study shows that generalist models cover more entities and concepts than specialist models but still fall short of radiologist-level depth, particularly for devices and quantified measurements, and that they occasionally hallucinate longitudinal comparisons. The results underscore the value of KG-based evaluation for uncovering knowledge gaps and for guiding development toward clinically aligned, longitudinal, multimodal radiology reporting systems.
Abstract
Recent advancements in artificial intelligence have significantly improved the automatic generation of radiology reports. However, existing evaluation methods fail to reveal how well the models understand radiological images and whether their descriptions reach human-level granularity. To bridge this gap, we introduce ReXKG, a system that extracts structured information from processed reports to construct a comprehensive radiology knowledge graph. We then propose three metrics to evaluate the similarity of nodes (ReXKG-NSC), distribution of edges (ReXKG-AMS), and coverage of subgraphs (ReXKG-SCS) across various knowledge graphs. We conduct an in-depth comparative analysis of AI-generated and human-written radiology reports, assessing the performance of both specialist and generalist models. Our study provides a deeper understanding of the capabilities and limitations of current AI models in radiology report generation, offering valuable insights for improving model performance and clinical applicability.
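
To make the idea of comparing knowledge graphs concrete, the sketch below builds two toy graphs from (head, relation, tail) triples and computes a simple node-coverage ratio and a relation-frequency distribution. It is only an illustration under stated assumptions: the triples, the relation names, and the functions node_coverage and relation_distribution are hypothetical and are not the definitions of ReXKG-NSC, ReXKG-AMS, or ReXKG-SCS, which this section does not spell out.

```python
from collections import Counter

# Toy triples (head, relation, tail). Entity and relation names are
# invented for illustration and are not taken from the paper.
human = [
    ("effusion", "located_at", "left pleural space"),
    ("effusion", "modified_by", "small"),
    ("catheter", "located_at", "superior vena cava"),
]
model = [
    ("effusion", "located_at", "left pleural space"),
]


def nodes(triples):
    """Collect every entity that appears as a head or a tail."""
    return {entity for head, _, tail in triples for entity in (head, tail)}


def node_coverage(reference, candidate):
    """Fraction of reference nodes that also appear in the candidate graph."""
    ref_nodes = nodes(reference)
    if not ref_nodes:
        return 0.0
    return len(ref_nodes & nodes(candidate)) / len(ref_nodes)


def relation_distribution(triples):
    """Normalized frequency of each relation type among the edges."""
    counts = Counter(relation for _, relation, _ in triples)
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()} if total else {}


if __name__ == "__main__":
    print("node coverage:", node_coverage(human, model))
    print("human relation mix:", relation_distribution(human))
    print("model relation mix:", relation_distribution(model))
```

In this toy case the model graph omits the device and the size modifier, so its node coverage is low and its relation mix collapses to a single type. That is the kind of gap a graph-level comparison can surface and a text-overlap score can miss.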
