AutoRG-Brain: Grounded Report Generation for Brain MRI

Jiayu Lei, Xiaoman Zhang, Chaoyi Wu, Lisong Dai, Ya Zhang, Yanyong Zhang, Yanfeng Wang, Weidi Xie, Yuehua Li

TL;DR

AutoRG-Brain tackles the burden of radiology report generation for brain MRI by decomposing the task into ROI-grounded segmentation and region-guided report generation. It introduces a two-stage pipeline and the RadGenome-Brain MRI dataset, achieving pixel-level grounding and strong performance on segmentation and grounded reporting, validated by automatic metrics, human evaluation, and deployment in a real clinical setting. The approach leverages self-supervised and semi-supervised training on large-scale partially labeled data and demonstrates meaningful improvements for junior radiologists in clinical workflows. The work is open-sourced to accelerate grounded report generation research in medical imaging.

Abstract

Radiologists are tasked with interpreting a large number of images on a daily basis and with generating the corresponding reports. This demanding workload elevates the risk of human error, potentially leading to treatment delays, increased healthcare costs, revenue loss, and operational inefficiencies. To address these challenges, we initiate a series of works on grounded Automatic Report Generation (AutoRG), starting with a brain MRI interpretation system that supports the delineation of brain structures, the localization of anomalies, and the generation of well-organized findings. We make contributions in three aspects. First, on dataset construction, we release a comprehensive dataset encompassing segmentation masks of anomaly regions and manually authored reports, termed RadGenome-Brain MRI. This data resource is intended to catalyze ongoing research and development in the field of AI-assisted report generation systems. Second, on system design, we propose AutoRG-Brain, the first brain MRI report generation system with pixel-level grounded visual clues. Third, for evaluation, we conduct quantitative assessments and human evaluations of brain structure segmentation, anomaly localization, and report generation tasks to provide evidence of the system's reliability and accuracy. This system has been integrated into real clinical scenarios, where radiologists were instructed to write reports based on our generated findings and anomaly segmentation masks. The results demonstrate that our system enhances the report-writing skills of junior doctors, aligning their performance more closely with that of senior doctors, thereby boosting overall productivity.

Paper Structure

This paper contains 30 sections, 8 equations, 19 figures, 6 tables.

Figures (19)

  • Figure 1: Overview of our contributions. (a) An analogy between radiologists and the pipeline of our proposed system, AutoRG-Brain, for report generation. (b) The proposed grounded report generation dataset, RadGenome-Brain MRI, contains 3,408 region-report pairs covering five diseases and six MRI modalities. (c) AutoRG-Brain improves radiologists' report-writing efficiency, bringing report quality closer to the gold-standard reports written by senior doctors.
  • Figure 1: Examples of abnormal segmentation of Sim2Real and Ours-S1, our segmentation module after the first self-supervised training stage, on the BraTS2021 and ISLES2022 datasets.
  • Figure 2: Comparison of our second-stage segmentation module (Ours-S2) with SOTA segmentation backbones and brain registration models on brain structure segmentation for multi-modal brain MRIs with real anomalies. We evaluate these models on four distinct datasets: BraTS2021, ISLES2022, RP3D-Brain, and SSPH. Because these datasets lack ground-truth brain structure segmentation, we present results based on human rankings on the left, where lower rankings indicate better outcomes. Note that the baseline nnU-Net differs in structure from our segmentation module, since we modify the original nnU-Net to fit our problem scenario.
  • Figure 2: The annotation page for human evaluation of the generated report.
  • Figure 3: The global report generated by AutoRG-Brain-AutoSeg on SSPH is evaluated by three radiologists in four dimensions: (i) Omission of finding; (ii) False prediction of finding; (iii) Incorrect location/position of finding; (iv) Incorrect description of lesion. The scores of the four dimensions are shown on the left, with the overall score shown on the right. Scores range from 0 to 4: 0 indicates more than three clinically significant errors or omissions, 1 indicates three, 2 indicates two, 3 indicates one, and 4 indicates no clinically significant errors or omissions.
  • ...and 14 more figures
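
The scoring rule stated in the Figure 3 caption (human evaluation of generated reports) can be sketched in a few lines of Python. This is only an illustration of the mapping from error counts to scores as the caption describes it, not the authors' evaluation code. The function name `rubric_score` and the example error counts are hypothetical; only the four dimension names and the 0 to 4 mapping come from the caption. How the overall score is derived is not stated in the caption, so it is not modeled here.

```python
def rubric_score(num_errors: int) -> int:
    """Map a count of clinically significant errors or omissions to a score.

    Per the Figure 3 caption: 0 errors -> 4, 1 -> 3, 2 -> 2, 3 -> 1,
    and more than three errors -> 0.
    """
    if num_errors < 0:
        raise ValueError("num_errors must be non-negative")
    return max(0, 4 - num_errors)


# Hypothetical error counts for one report, keyed by the four dimensions
# named in the caption. The counts are made up for illustration.
example_errors = {
    "omission_of_finding": 1,
    "false_prediction_of_finding": 0,
    "incorrect_location_of_finding": 2,
    "incorrect_description_of_lesion": 5,
}

# Score each dimension independently with the same rule.
example_scores = {dim: rubric_score(n) for dim, n in example_errors.items()}

if __name__ == "__main__":
    for dim, score in example_scores.items():
        print(f"{dim}: {score}")
```

Note that higher scores are better here, whereas the human rankings in Figure 2 are better when lower.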