MRGAgents: A Multi-Agent Framework for Improved Medical Report Generation with Med-LVLMs

Pengyu Wang, Shuchang Ye, Usman Naseem, Jinman Kim

TL;DR

The paper addresses two weaknesses of Medical Large Vision-Language Models (Med-LVLMs) in radiology report generation: a bias toward predicting normal findings, and incomplete coverage of clinically relevant regions. It introduces MRGAgents, a multi-agent framework in which 13 disease-specific agents, aligned to CheXbert labels, are fine-tuned on curated disease subsets and their outputs are aggregated into a structured report. On IU X-ray and MIMIC-CXR, the approach outperforms single-model baselines such as BioMedGPT, particularly on recall and CIDEr-based metrics, and achieves significant disease-level detection gains. The resulting reports give detailed, clinically relevant descriptions across disease categories. Visualization studies highlight practical improvements as well as remaining gaps, such as describing non-disease findings like implants.

Abstract

Medical Large Vision-Language Models (Med-LVLMs) have been widely adopted for medical report generation. Although Med-LVLMs achieve state-of-the-art performance, they exhibit a bias toward predicting all findings as normal, leading to reports that overlook critical abnormalities. Furthermore, these models often fail to provide comprehensive descriptions of radiologically relevant regions necessary for accurate diagnosis. To address these challenges, we propose Medical Report Generation Agents (MRGAgents), a novel multi-agent framework that fine-tunes specialized agents for different disease categories. By curating subsets of the IU X-ray and MIMIC-CXR datasets to train disease-specific agents, MRGAgents generates reports that more effectively balance normal and abnormal findings while ensuring a comprehensive description of clinically relevant regions. Our experiments demonstrate that MRGAgents outperforms the state of the art, improving both report comprehensiveness and diagnostic utility.
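The aggregation idea described above (one specialized agent per disease category, each contributing one sentence to the final report) can be illustrated with a minimal sketch. This is not the authors' implementation. The stub agents, the function names `make_stub_agent` and `aggregate_report`, the sentence templates, and the three-label subset are all assumptions made for illustration. In the paper each agent is a fine-tuned Med-LVLM and there are 13 of them.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a fine-tuned disease-specific agent:
# it receives an image identifier and returns one report sentence.
Agent = Callable[[str], str]


def make_stub_agent(disease: str, positive: bool) -> Agent:
    """Build a toy agent that returns a fixed sentence (illustration only)."""
    def agent(image_id: str) -> str:
        if positive:
            return f"Findings consistent with {disease.lower()}."
        return f"No evidence of {disease.lower()}."
    return agent


def aggregate_report(agents: Dict[str, Agent], image_id: str) -> str:
    """Query every agent once and join the sentences in insertion order."""
    sentences = [agent(image_id) for agent in agents.values()]
    return " ".join(sentences)


# Illustrative subset of disease categories, not the paper's full set of 13.
agents = {
    "Cardiomegaly": make_stub_agent("Cardiomegaly", positive=True),
    "Pleural Effusion": make_stub_agent("Pleural Effusion", positive=False),
    "Pneumothorax": make_stub_agent("Pneumothorax", positive=False),
}

report = aggregate_report(agents, "example_image")
print(report)
```

Because every agent always emits a sentence, the report covers each category explicitly, whether the finding is positive or negative, rather than defaulting to a generic "normal" statement.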

Paper Structure

This paper contains 14 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Overall Framework of MRGAgents
  • Figure 2: The distribution of positive and negative sentences for each disease.
  • Figure 3: Examples of generated reports, with different text colors highlighting various medical descriptions for comparison with the Ground Truth. All reports generated by MRGAgents consist of 13 sentences, each corresponding to a specific disease category.