Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning

Zijian Zhou, Miaojing Shi, Meng Wei, Oluwatosin Alabi, Zijie Yue, Tom Vercauteren

TL;DR

The paper addresses the challenge of generating radiology reports that meet clinical quality standards by introducing LM-RRG, a framework that combines an LLM-driven visual feature extractor, a multimodal report generator, and a clinical quality reinforcement learning loop that uses RadCliQ as its reward. The method builds region-aware visual prompts from LLM-generated descriptions, generates reports auto-regressively with a multimodal decoder, and applies PPO-style updates to optimize clinical relevance while keeping reports aligned with ground-truth references. Experiments show state-of-the-art performance on MIMIC-CXR and competitive results on IU-Xray, with ablations confirming the contribution of each component and the value of RadCliQ-based reinforcement learning. The approach could improve the clinical accuracy of generated radiology reports and streamline radiologist workflows.

Abstract

Radiology report generation (RRG) has attracted significant attention due to its potential to reduce the workload of radiologists. Current RRG approaches still fall short of clinical standards. This paper introduces a novel RRG method, LM-RRG, that integrates large models (LMs) with clinical quality reinforcement learning to generate accurate and comprehensive chest X-ray radiology reports. Our method first designs a large language model driven feature extractor to analyze and interpret different regions of the chest X-ray image, emphasizing specific regions with medical significance. Next, based on the large model's decoder, we develop a multimodal report generator that leverages multimodal prompts from visual features and textual instruction to produce the radiology report in an auto-regressive way. Finally, to better reflect the clinically significant and insignificant errors that radiologists would normally identify in a report, we introduce a novel clinical quality reinforcement learning strategy. It uses the radiology report clinical quality (RadCliQ) metric as a reward function in the learning process. Extensive experiments on the MIMIC-CXR and IU-Xray datasets demonstrate the superiority of our method over the state of the art.
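
To make the reinforcement learning step more concrete, the following is a minimal sketch of how a lower-is-better report quality score could be turned into a reward and plugged into a PPO-style clipped objective. It is not the authors' implementation. The scorer `toy_quality_score` is a hypothetical stand-in (it is not RadCliQ), and the mean-reward baseline, the function names, and the clipping range of 0.2 are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def toy_quality_score(generated: str, reference: str) -> float:
    """Hypothetical stand-in for a clinical quality scorer (lower = better).

    This is NOT RadCliQ: it only returns the fraction of reference words
    that are missing from the generated report.
    """
    ref_words = set(reference.lower().split())
    gen_words = set(generated.lower().split())
    if not ref_words:
        return 0.0
    return len(ref_words - gen_words) / len(ref_words)


def advantages_from_scores(scores: Sequence[float]) -> List[float]:
    """Negate lower-is-better scores into rewards, then subtract the batch
    mean as a simple baseline (an assumption, not taken from the paper)."""
    rewards = [-s for s in scores]
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


def ppo_clipped_loss(
    new_logp: Sequence[float],
    old_logp: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean PPO clipped surrogate, negated so that it can be minimized.

    Each entry is the summed log-probability of one sampled report under
    the current (new) and sampling-time (old) policy.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# Toy usage: two sampled reports scored against one reference report.
reference = "mild cardiomegaly with no pleural effusion"
samples = ["mild cardiomegaly with no pleural effusion", "lungs are clear"]
scores = [toy_quality_score(s, reference) for s in samples]
advantages = advantages_from_scores(scores)
loss = ppo_clipped_loss([-1.0, -1.2], [-1.1, -1.1], advantages)
```

In this toy batch the report that matches the reference receives a positive advantage and the unrelated one a negative advantage, so minimizing the loss raises the probability of the former. A real setup would replace the toy scorer with the actual metric and compute log-probabilities from the report generator.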

Paper Structure (12 sections, 5 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: The overall framework of our LM-RRG, which comprises three components: an LLM-driven visual feature extractor, a multimodal report generator, and clinical quality reinforcement learning (CQRL).
  • Figure 2: Two examples of our generated reports, alongside their input radiology images and ground-truth reports. Critical findings in the radiology images are roughly labelled with coloured boxes, and clinically significant/insignificant observations are highlighted in both reports. a) Ground-truth report, including critical findings and the impression annotated by radiologists. b) Report generated by our LM-RRG. Our reports accurately identify the regions of abnormalities and closely align with the ground truth.