Automated Structured Radiology Report Generation with Rich Clinical Context

Seongjae Kang, Dong Bok Lee, Juho Jung, Dongseop Kim, Won Hwa Kim, Sunghoon Joo

TL;DR

This paper tackles the lack of clinical context in automated structured radiology report generation (SRRG), which leads to temporal hallucinations. It proposes contextualized SRRG (C-SRRG) and builds the C-SRRG dataset by integrating multi-view X-rays, indication, technique, and prior studies, enabling richer, longitudinal reasoning. Across state-of-the-art medical MLLMs (CheXagent-3B, MedGemma-4B, Lingshu-7B), incorporating clinical context consistently improves report quality (F1-SRR-BERT, Category Score) and mitigates temporal hallucinations, with larger models deriving greater benefits. The work includes extensive ablations, analyzes the individual context components, and releases the dataset, code, and model checkpoints to advance clinically aligned automated radiology reporting.

Abstract

Automated structured radiology report generation (SRRG) from chest X-ray images offers significant potential to reduce radiologists' workload by generating reports in structured formats that ensure clarity, consistency, and adherence to clinical reporting standards. While radiologists effectively utilize available clinical context in their diagnostic reasoning, existing SRRG systems overlook these essential elements. This fundamental gap leads to critical problems, including temporal hallucinations, in which a report references clinical context that was never provided. To address these limitations, we propose contextualized SRRG (C-SRRG), which comprehensively incorporates rich clinical context for SRRG. We curate the C-SRRG dataset by integrating comprehensive clinical context encompassing 1) multi-view X-ray images, 2) clinical indication, 3) imaging techniques, and 4) prior studies with corresponding comparisons based on patient histories. Through extensive benchmarking with state-of-the-art multimodal large language models, we demonstrate that incorporating clinical context with the proposed C-SRRG significantly improves report generation quality. We publicly release the dataset, code, and checkpoints to facilitate future research for clinically aligned automated RRG at https://github.com/vuno/contextualized-srrg.
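
To make the four context components concrete, the following is a minimal Python sketch of how one contextualized sample might be represented and turned into a text prompt. It is not the released C-SRRG code: the class names (`PriorStudy`, `CSRRGSample`), the field names, the prompt wording, and the phrase list used to flag comparison language are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PriorStudy:
    """One earlier study for the same patient (hypothetical schema)."""
    image_paths: List[str]
    report: Optional[str] = None


@dataclass
class CSRRGSample:
    """Hypothetical container for one contextualized SRRG example."""
    image_paths: List[str]            # 1) multi-view X-ray images
    indication: Optional[str] = None  # 2) clinical indication
    technique: Optional[str] = None   # 3) imaging technique
    priors: List[PriorStudy] = field(default_factory=list)  # 4) prior studies


def build_prompt(sample: CSRRGSample, task: str = "findings") -> str:
    """Assemble a text prompt, skipping context fields that are missing."""
    parts = [f"Current study: {len(sample.image_paths)} view(s)."]
    if sample.indication:
        parts.append(f"Indication: {sample.indication}")
    if sample.technique:
        parts.append(f"Technique: {sample.technique}")
    if sample.priors:
        for i, prior in enumerate(sample.priors, start=1):
            parts.append(f"Prior study {i}: {prior.report or 'report unavailable'}")
    else:
        # State the absence explicitly so the model has no reason to compare.
        parts.append("Prior studies: none provided.")
    parts.append(f"Write the structured {task} section.")
    return "\n".join(parts)


# Illustrative phrase list, not taken from the paper.
_TEMPORAL = re.compile(
    r"\b(new from prior|compared (?:to|with) prior|unchanged|interval)\b",
    re.IGNORECASE,
)


def has_temporal_hallucination(report: str, sample: CSRRGSample) -> bool:
    """Flag comparison language when no prior study was supplied."""
    return not sample.priors and bool(_TEMPORAL.search(report))
```

For example, `has_temporal_hallucination("Opacity, new from prior exam.", CSRRGSample(image_paths=["pa.png"]))` flags the report because it uses comparison language while `priors` is empty. A keyword check like this is only a rough heuristic and is not the evaluation protocol described in the paper.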

Paper Structure

This paper contains 42 sections, 30 figures, 11 tables.

Figures (30)

  • Figure 1: Clinical context consistently and significantly improves medical MLLMs, including CheXagent-3B [chen2024chexagent], MedGemma-4B [sellergren2025medgemma], and Lingshu-7B [xu2025lingshu], on both the findings and impression tasks for SRRG, as measured by the F1-SRR-BERT metric [delbrouck2025automated]. Clinical context becomes increasingly critical as MLLMs scale up, highlighting its importance in RRG.
  • Figure 2: A conceptual illustration of the proposed C-SRRG. (a) Radiologists routinely use clinical context, while (b) existing SRRG frameworks do not. Motivated by this gap, (c) C-SRRG leverages multi-view images, indication, technique, and variable-length prior studies/comparisons to generate structured radiology reports.
  • Figure 3: An example of temporal hallucinations. This report contains "new from prior exam" even though no prior studies are provided. Please see examples of full structured reports in \ref{fig:hallucination_example1}, \ref{fig:hallucination_example2}, and \ref{fig:hallucination_example3}.
  • Figure 4: Available proportion of clinical context for each split in findings and impression.
  • Figure 5: Distribution of the number of prior studies available per sample for findings and impression.
  • ...and 25 more figures