Zodiac: A Cardiologist-Level LLM Framework for Multi-Agent Diagnostics

Yuan Zhou, Peng Zhang, Mengya Song, Alice Zheng, Yiwen Lu, Zhiheng Liu, Yong Chen, Zhaohan Xi

TL;DR

Zodiac addresses the lack of cardiologist-level professionalism in LLM-based diagnostics with a multi-agent framework that processes multimodal patient data through specialized LLMs. It combines fine-tuning on cardiologist-adjudicated patient data with instruction tuning, in-context learning, and guideline-based fact-checking, and is validated on real-world ECG data across eight clinical metrics. The approach outperforms industry-leading models and medical-specialist LLMs at sub-30B scales, and integrates into Software-as-Medical-Device workflows via AWS-based deployment. The work offers a practical blueprint for building clinical-grade, human-in-the-loop LLMs in cardiology and beyond, with the aim of safer and more efficient AI-assisted patient care.

Abstract

Large language models (LLMs) have demonstrated remarkable progress in healthcare. However, a significant gap remains regarding LLMs' professionalism in domain-specific clinical practices, limiting their application in real-world diagnostics. In this work, we introduce ZODIAC, an LLM-powered framework with cardiologist-level professionalism designed to engage LLMs in cardiological diagnostics. ZODIAC assists cardiologists by extracting clinically relevant characteristics from patient data, detecting significant arrhythmias, and generating preliminary reports for the review and refinement by cardiologists. To achieve cardiologist-level professionalism, ZODIAC is built on a multi-agent collaboration framework, enabling the processing of patient data across multiple modalities. Each LLM agent is fine-tuned using real-world patient data adjudicated by cardiologists, reinforcing the model's professionalism. ZODIAC undergoes rigorous clinical validation with independent cardiologists, evaluated across eight metrics that measure clinical effectiveness and address security concerns. Results show that ZODIAC outperforms industry-leading models, including OpenAI's GPT-4o, Meta's Llama-3.1-405B, and Google's Gemini-pro, as well as medical-specialist LLMs like Microsoft's BioGPT. ZODIAC demonstrates the transformative potential of specialized LLMs in healthcare by delivering domain-specific solutions that meet the stringent demands of medical practice. Notably, ZODIAC has been successfully integrated into electrocardiography (ECG) devices, exemplifying the growing trend of embedding LLMs into Software-as-Medical-Device (SaMD).

Paper Structure (23 sections, 2 equations, 14 figures, 2 tables)

Figures (14)

  • Figure 1: Zodiac attains cardiologist-level professionalism through a combination of advanced data integration with sophisticated technical designs.
  • Figure 2: Zodiac aligns with cardiological practice through a multi-agent framework that integrates patient data across various modalities: ➀ Patient data is collected in two modalities: tabular metrics and ECG tracings (images). ➁ A metrics-to-findings LLM agent processes the tabular metrics and generates text-based clinical findings. ➂ A tracings-to-findings LLM agent analyzes the ECG tracings to produce additional text-based clinical findings. ➃ The clinical findings from both agents are then combined. ➄ A findings-to-interpretation LLM agent synthesizes these findings with clinical guidelines into a comprehensive diagnostic interpretation. ➅ Zodiac generates a patient-specific report by integrating the metrics, tracings, clinical findings, and diagnostic interpretation. ➆ A cardiologist validates the quality of the generated findings and interpretations (details in \ref{sec:expt}). For simplicity, we omit the biostatistics ($\mathcal{B}$) in this figure, which are considered in steps ➀➁➂ by default.
  • Figure 3: (a)-(c) illustrate the prompts used for $\theta_\texttt{M2F}$ (prompts for $\theta_\texttt{T2F}$ and $\theta_\texttt{F2I}$ are in Figure \ref{fig:prompt_more}): (a) shows the instructions (or "system prompt") used for both fine-tuning and inference; (b) includes the demonstrations used for in-context learning during inference; and (c) shows the input and response structures. During fine-tuning, (c) is filled with cardiologist-adjudicated texts, whereas during inference, (c) retains the format presented above to specify the response format. (d) presents the statistics of our collected patient data, which are further subgrouped by gender, age, and arrhythmia class (Class I: normal arrhythmias; Class II: clinically significant arrhythmias; Class III: life-threatening arrhythmias). Detailed clinical implications are provided in Appendix \ref{ap:arrhythima_class}.
  • Figure 4: Workflow of Zodiac assisting cardiologists through AWS deployment.
  • Figure 5: An example of interpretation generated by Zodiac.
  • ...and 9 more figures
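
The data flow in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `PatientData`, `Report`, `run_pipeline`, and `stub_agent` are hypothetical, each agent is modeled as a plain text-to-text callable, and the ECG tracing is represented by a placeholder string rather than an image. Only the ordering of steps ➁ through ➆ and the agent labels M2F, T2F, and F2I come from the captions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical agent type: each agent maps input text to output text.
Agent = Callable[[str], str]


@dataclass
class PatientData:
    biostatistics: str  # B in the caption (e.g. age, gender)
    metrics: str        # tabular metrics, serialized to text
    tracings: str       # placeholder for ECG tracing images


@dataclass
class Report:
    patient: PatientData
    findings: List[str]
    interpretation: str
    # Step 7: the report is preliminary until a cardiologist validates it.
    needs_cardiologist_review: bool = True


def run_pipeline(patient: PatientData, m2f: Agent, t2f: Agent,
                 f2i: Agent, guidelines: str) -> Report:
    # Steps 2 and 3: each modality goes to its own agent, with biostatistics.
    metric_findings = m2f(f"{patient.biostatistics}\n{patient.metrics}")
    tracing_findings = t2f(f"{patient.biostatistics}\n{patient.tracings}")
    # Step 4: combine the findings from both agents.
    findings = [metric_findings, tracing_findings]
    # Step 5: synthesize findings with clinical guidelines.
    interpretation = f2i("\n".join(findings + [guidelines]))
    # Step 6: assemble the patient-specific report.
    return Report(patient, findings, interpretation)


def stub_agent(tag: str) -> Agent:
    # Stand-in for a fine-tuned LLM; it only labels its input.
    return lambda prompt: f"[{tag}] processed {len(prompt)} characters"


patient = PatientData("age 60", "HR 72 bpm", "ecg_strip.png")
report = run_pipeline(patient, stub_agent("M2F"), stub_agent("T2F"),
                      stub_agent("F2I"), guidelines="(guideline text)")
print(report.findings)
print(report.interpretation)
```

Keeping the agents as interchangeable callables mirrors the separation of the three roles in the caption, and the `needs_cardiologist_review` flag makes the human-in-the-loop step explicit in the returned object.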