Radiology Workflow-Guided Hierarchical Reinforcement Fine-Tuning for Medical Report Generation

Bodong Du; Honglong Yang; Xiaomeng Li

TL;DR

RadFlow addresses the misalignment between descriptive findings and diagnostic impressions in medical report generation with a hierarchical reinforcement fine-tuning framework that mirrors the radiologist's workflow. It decomposes the reward into a global component, covering fluent, clinically faithful Findings and cross-sectional consistency, and a local component for Impression accuracy. These are augmented by Target Exploration and a critical-aware policy optimization (CAPO) mechanism that tightens updates for high-stakes cases. The approach is backed by theoretical guarantees on policy stability and evaluated on carotid ultrasound and chest X-ray datasets, where RadFlow achieves better diagnostic coherence and overall report quality than state-of-the-art baselines. The work points toward incorporating structured clinical reasoning into end-to-end learning, with potential extensions to more modalities and human-in-the-loop feedback to further improve reliability and safety in clinical reporting.

Abstract

Radiologists compose diagnostic reports through a structured workflow: they describe visual findings, summarize them into impressions, and carefully refine statements in clinically critical cases. However, most existing medical report generation (MRG) systems treat reports as flat sequences, overlooking this hierarchical organization and leading to inconsistencies between descriptive and diagnostic content. To align model behavior with real-world reporting practices, we propose RadFlow, a hierarchical workflow-guided reinforcement optimization framework that explicitly models the structured nature of clinical reporting. RadFlow introduces a clinically grounded reward hierarchy that mirrors the organization of radiological reports. At the global level, the reward integrates linguistic fluency, medical-domain correctness, and cross-sectional consistency between Finding and Impression, promoting coherent and clinically faithful narratives. At the local level, a section-specific reward emphasizes Impression quality, reflecting its central role in diagnostic accuracy. Furthermore, a critical-aware policy optimization mechanism adaptively regularizes learning for high-risk or clinically sensitive cases, emulating the cautious refinement behavior of radiologists when documenting critical findings. Together, these components translate the structured reporting paradigm into the reinforcement fine-tuning process, enabling the model to generate reports that are both linguistically consistent and clinically aligned. Experiments on chest X-ray and carotid ultrasound datasets demonstrate that RadFlow consistently improves diagnostic coherence and overall report quality compared with state-of-the-art baselines.
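
To make the reward hierarchy and the critical-aware update more concrete, here is a minimal Python sketch. It is not the authors' implementation: the equal weighting of the global terms, the `local_weight` blend, the PPO-style clipped surrogate, and every name and default value below are assumptions chosen only to illustrate the idea of a global-plus-local reward and a tighter update for critical cases.

```python
from dataclasses import dataclass


@dataclass
class SectionScores:
    """Hypothetical per-report scores in [0, 1]; how they are computed is not specified here."""
    fluency: float       # linguistic fluency of the whole report
    correctness: float   # medical-domain correctness
    consistency: float   # agreement between Findings and Impression
    impression: float    # section-specific Impression quality


def hierarchical_reward(s: SectionScores, local_weight: float = 0.5) -> float:
    """Blend a global reward with a local Impression reward (assumed weighting)."""
    # Global level: unweighted mean of the three report-wide terms (an assumption).
    global_r = (s.fluency + s.correctness + s.consistency) / 3.0
    # Local level: the Impression score, mixed in with an assumed weight.
    return (1.0 - local_weight) * global_r + local_weight * s.impression


def clip_range(is_critical: bool, base_eps: float = 0.2,
               critical_scale: float = 0.5) -> float:
    """Return a narrower clipping range for clinically critical cases (assumed rule)."""
    return base_eps * critical_scale if is_critical else base_eps


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """PPO-style clipped objective for one sample; used here as a stand-in for CAPO."""
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    scores = SectionScores(fluency=0.9, correctness=0.6, consistency=0.3, impression=1.0)
    reward = hierarchical_reward(scores)
    for critical in (False, True):
        eps = clip_range(critical)
        print(critical, round(clipped_surrogate(1.3, reward, eps), 3))
```

In this sketch, the same probability ratio contributes less to the objective when the case is flagged critical, because the narrower clipping range caps how far a single update can move the policy.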
