Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning
Zijian Zhou, Miaojing Shi, Meng Wei, Oluwatosin Alabi, Zijie Yue, Tom Vercauteren
TL;DR
The paper addresses the challenge of generating radiology reports that meet clinical quality standards by introducing LM-RRG, a framework that combines an LLM-driven visual feature extractor, a multimodal report generator, and a clinical quality reinforcement learning loop that uses RadCliQ as its reward. The method uses region-aware visual prompts derived from LLM descriptions, a multimodal decoder that generates reports auto-regressively, and PPO-style updates that optimize clinical relevance while maintaining alignment with ground-truth references. Experiments show state-of-the-art performance on MIMIC-CXR and competitive results on IU-Xray, with ablations confirming the contribution of each component and the value of RadCliQ-based reinforcement learning. The approach could improve the clinical accuracy of generated radiology reports and streamline radiologist workflows in real-world clinical settings.
Abstract
Radiology report generation (RRG) has attracted significant attention due to its potential to reduce the workload of radiologists. Current RRG approaches, however, still fall short of clinical standards. This paper introduces a novel RRG method, LM-RRG, that integrates large models (LMs) with clinical quality reinforcement learning to generate accurate and comprehensive chest X-ray radiology reports. Our method first designs a large language model driven feature extractor to analyze and interpret different regions of the chest X-ray image, emphasizing specific regions with medical significance. Next, based on the large model's decoder, we develop a multimodal report generator that leverages multimodal prompts, built from visual features and a textual instruction, to produce the radiology report in an auto-regressive way. Finally, to better reflect the clinically significant and insignificant errors that radiologists would normally identify in a report, we introduce a novel clinical quality reinforcement learning strategy that uses the radiology report clinical quality (RadCliQ) metric as the reward function during learning. Extensive experiments on the MIMIC-CXR and IU-Xray datasets demonstrate the superiority of our method over the state of the art.
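To make the reinforcement learning step more concrete, the following is a minimal sketch of how a lower-is-better report quality score could be turned into a reward and fed into a PPO-style clipped objective. It is an illustration under stated assumptions, not the authors' implementation: `toy_error_metric` is a hypothetical stand-in for RadCliQ (it only counts missing reference words), and the function names, the mean-reward baseline, and the clipping value of 0.2 are assumptions chosen for the example.

```python
import math
from typing import Callable, List


def toy_error_metric(report: str, reference: str) -> int:
    """Hypothetical stand-in for RadCliQ: counts reference words
    missing from the generated report (lower is better)."""
    report_words = set(report.lower().split())
    return sum(1 for w in reference.lower().split() if w not in report_words)


def quality_rewards(
    reports: List[str],
    references: List[str],
    error_metric: Callable[[str, str], float],
) -> List[float]:
    """Negate a lower-is-better error score so that higher reward is better."""
    return [-error_metric(rep, ref) for rep, ref in zip(reports, references)]


def advantages(rewards: List[float]) -> List[float]:
    """Subtract the batch-mean reward as a simple baseline (an assumption)."""
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


def ppo_clipped_objective(
    logp_new: float, logp_old: float, advantage: float, clip_eps: float = 0.2
) -> float:
    """Standard PPO clipped surrogate for one sampled report.

    logp_new / logp_old are the sequence log-probabilities of the report
    under the current and the sampling policy, respectively.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    refs = ["no pleural effusion", "no pleural effusion"]
    gens = ["no pleural effusion", "clear lungs"]
    rewards = quality_rewards(gens, refs, toy_error_metric)
    for adv in advantages(rewards):
        # Same log-probabilities before and after: ratio is 1, no clipping.
        print(ppo_clipped_objective(-5.0, -5.0, adv))
```

In a real setup the metric call would be replaced by an actual RadCliQ scorer and the log-probabilities would come from the report generator; the sketch only shows how the reward sign and the clipping interact.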
