Automated Structured Radiology Report Generation with Rich Clinical Context
Seongjae Kang, Dong Bok Lee, Juho Jung, Dongseop Kim, Won Hwa Kim, Sunghoon Joo
TL;DR
This paper addresses a limitation of automated structured radiology report generation (SRRG): existing systems lack clinical context, which leads to temporal hallucinations. It proposes contextualized SRRG (C-SRRG) and builds the C-SRRG dataset by integrating multi-view X-rays, indication, technique, and prior studies, enabling richer, longitudinal reasoning. Across state-of-the-art medical MLLMs (CheXagent-3B, MedGemma-4B, Lingshu-7B), incorporating clinical context consistently improves report quality (F1-SRR-BERT, Category Score) and mitigates temporal hallucinations, with larger models deriving greater benefits. The work includes extensive ablations, analyzes the individual context components, and releases the dataset, code, and model checkpoints to advance clinically aligned automated radiology reporting. The results highlight the need for context-aware generation in large multimodal models for radiology applications.
Abstract
Automated structured radiology report generation (SRRG) from chest X-ray images offers significant potential to reduce the workload of radiologists by generating reports in structured formats that ensure clarity, consistency, and adherence to clinical reporting standards. While radiologists routinely use the available clinical context in their diagnostic reasoning, existing SRRG systems overlook it. This gap leads to critical problems, including temporal hallucinations, in which a generated report refers to clinical context that does not exist. To address these limitations, we propose contextualized SRRG (C-SRRG), which incorporates rich clinical context into SRRG. We curate the C-SRRG dataset by integrating comprehensive clinical context encompassing 1) multi-view X-ray images, 2) clinical indication, 3) imaging techniques, and 4) prior studies with corresponding comparisons based on patient histories. Through extensive benchmarking with state-of-the-art multimodal large language models, we demonstrate that incorporating clinical context with the proposed C-SRRG significantly improves report generation quality. We publicly release the dataset, code, and checkpoints to facilitate future research on clinically aligned automated RRG at https://github.com/vuno/contextualized-srrg.
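To make the four context types concrete, the following is a minimal sketch of how one contextualized sample might be represented and turned into a text prompt. It is an illustration only: the class names (`PriorStudy`, `StudyContext`), the field names, the `build_prompt` helper, and the prompt wording are all assumptions made for this example and are not taken from the released dataset or code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PriorStudy:
    # Hypothetical record for an earlier exam of the same patient.
    image_paths: List[str]
    report: str


@dataclass
class StudyContext:
    # Hypothetical container for the four context types named in the abstract.
    image_paths: List[str]                 # 1) multi-view X-ray images
    indication: Optional[str] = None       # 2) clinical indication
    technique: Optional[str] = None        # 3) imaging technique
    priors: List[PriorStudy] = field(default_factory=list)  # 4) prior studies


def build_prompt(ctx: StudyContext) -> str:
    """Assemble the text part of a prompt; images would be passed to the model separately."""
    lines = [f"Current study: {len(ctx.image_paths)} view(s)."]
    if ctx.indication:
        lines.append(f"Indication: {ctx.indication}")
    if ctx.technique:
        lines.append(f"Technique: {ctx.technique}")
    if ctx.priors:
        for i, prior in enumerate(ctx.priors, start=1):
            lines.append(f"Prior study {i} report: {prior.report}")
    else:
        # State the absence explicitly so the model has no prior to compare against.
        lines.append("Prior studies: none available.")
    lines.append("Write a structured report.")
    return "\n".join(lines)
```

In this sketch, a study without priors yields an explicit "none available" line, which reflects the abstract's point that comparisons to non-existent prior exams are a source of temporal hallucinations. Whether the authors' prompts handle missing priors this way is not stated in the abstract.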
