MAGDA: Multi-agent guideline-driven diagnostic assistance

David Bani-Harouni, Nassir Navab, Matthias Keicher

TL;DR

MAGDA is a three-agent, guideline-driven, zero-shot diagnostic framework for chest X-rays that combines clinical guidelines with dynamic vision-language prompting and chain-of-thought reasoning. It screens images with a CLIP-based module guided by disease-specific descriptors, reasons over the resulting findings with an LLM, and refines the predictions to account for inter-disease dependencies. MAGDA reports state-of-the-art zero-shot performance on CheXpert and competitive results on ChestX-ray 14 Longtail, particularly for rare diseases. The approach is explainable, because its reasoning is explicit, and adaptable, because it requires no fine-tuning, which allows rapid deployment in resource-limited settings. Its practical impact lies in closer alignment with medical guidelines, improved trust through transparent reasoning, and applicability to understaffed clinical environments where access to radiologists is limited.

Abstract

In emergency departments, rural hospitals, or clinics in less developed regions, clinicians often lack fast image analysis by trained radiologists, which can have a detrimental effect on patients' healthcare. Large Language Models (LLMs) have the potential to alleviate some pressure on these clinicians by providing insights that can help them in their decision-making. While these LLMs achieve high scores on medical exams, showcasing their great theoretical medical knowledge, they tend not to follow medical guidelines. In this work, we introduce a new approach for zero-shot guideline-driven decision support. We model a system of multiple LLM agents, augmented with a contrastive vision-language model, that collaborate to reach a patient diagnosis. After being provided with simple diagnostic guidelines, the agents synthesize prompts and screen the image for findings following these guidelines. Finally, they provide understandable chain-of-thought reasoning for their diagnosis, which is then self-refined to consider inter-dependencies between diseases. As our method is zero-shot, it is adaptable to settings with rare diseases, where training data is limited but expert-crafted disease descriptions are available. We evaluate our method on two chest X-ray datasets, CheXpert and ChestX-ray 14 Longtail, showcasing performance improvement over existing zero-shot methods and generalizability to rare diseases.
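
To make the screening step more concrete, the following is a minimal sketch of how a contrastive vision-language model could score an image against guideline-derived prompts. It is not the authors' implementation. It assumes that each finding is described by one positive and one negative prompt, that image and prompt embeddings are already available as plain vectors, and that the presence probability is a softmax over the two cosine similarities. The function names (`screen_finding`, `screen_image`, `summarize`), the temperature value, the threshold, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def screen_finding(image_emb, pos_emb, neg_emb, temperature=0.07):
    """Probability that a finding is present, as a two-way softmax over
    the similarities to a positive and a negative prompt embedding.
    The temperature is an assumed value, not one taken from the paper."""
    s_pos = cosine(image_emb, pos_emb) / temperature
    s_neg = cosine(image_emb, neg_emb) / temperature
    m = max(s_pos, s_neg)  # subtract the max for numerical stability
    e_pos = math.exp(s_pos - m)
    e_neg = math.exp(s_neg - m)
    return e_pos / (e_pos + e_neg)


def screen_image(image_emb, prompt_embs):
    """Score every finding; prompt_embs maps a finding name to a
    (positive_embedding, negative_embedding) pair."""
    return {
        name: screen_finding(image_emb, pos, neg)
        for name, (pos, neg) in prompt_embs.items()
    }


def summarize(findings, threshold=0.5):
    """Render the scores as text that a reasoning LLM could be given."""
    lines = []
    for name, p in sorted(findings.items()):
        status = "present" if p >= threshold else "absent"
        lines.append(f"{name}: {status} (p={p:.2f})")
    return "\n".join(lines)


# Toy two-dimensional embeddings standing in for real encoder outputs.
image_emb = [1.0, 0.0]
prompt_embs = {
    "blunted costophrenic angle": ([1.0, 0.0], [0.0, 1.0]),
    "enlarged cardiac silhouette": ([0.0, 1.0], [1.0, 0.0]),
}
findings = screen_image(image_emb, prompt_embs)
report = summarize(findings)
print(report)
```

In a full pipeline of the kind the abstract describes, a text summary like `report` would be handed to an LLM for chain-of-thought diagnosis and then to a refinement step. Those stages depend on prompts and models that this page does not specify, so they are left out of the sketch.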

Paper Structure

This paper contains 14 sections, 3 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Schematic overview of the proposed method MAGDA.
  • Figure 2: A qualitative example of the model reasoning.