CardioRAG: A Retrieval-Augmented Generation Framework for Multimodal Chagas Disease Detection
Zhengyang Shen, Xuehao Zhai, Hua Tu, Mayue Shi
TL;DR
The paper tackles Chagas disease screening in settings with limited serology by proposing CardioRAG, a retrieval-augmented generation framework that fuses interpretable ECG biomarkers with LLM-based diagnostic reasoning. It combines ECG feature extraction, VAE-based latent representations for case retrieval, and structured LLM outputs to deliver evidence-based diagnoses with confidence. Key findings show that a simplified prompt design and moderate, demographic-aware retrieval (k=8) yield the best balance of accuracy and recall, achieving up to 58.59% accuracy and 89.80% recall on a 100-patient test set. This approach supports high-recall triage for serology in low-resource regions and demonstrates how clinical indicators can be embedded into trustworthy medical AI systems.
Abstract
Chagas disease affects nearly 6 million people worldwide, with Chagas cardiomyopathy representing its most severe complication. In regions where serological testing capacity is limited, AI-enhanced electrocardiogram (ECG) screening provides a critical diagnostic alternative. However, existing machine learning approaches face challenges such as limited accuracy, reliance on large labeled datasets, and, more importantly, weak integration with evidence-based clinical diagnostic indicators. We propose CardioRAG, a retrieval-augmented generation framework that integrates large language models with interpretable ECG-based clinical features, including right bundle branch block, left anterior fascicular block, and heart rate variability metrics. The framework uses representations learned by a variational autoencoder for semantic case retrieval, supplying similar cases as context to guide clinical reasoning. Evaluation demonstrated a high recall of 89.80% and a maximum F1 score of 0.68, supporting effective identification of positive cases that require prioritized serological testing. CardioRAG provides an interpretable, clinical evidence-based approach that is particularly valuable for resource-limited settings, demonstrating a pathway for embedding clinical indicators into trustworthy medical AI systems.
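To make the retrieval step concrete, the following is a minimal sketch of nearest-neighbour case retrieval over latent vectors, with an optional demographic filter. It is not the authors' implementation: the function name `retrieve_similar_cases`, the use of Euclidean distance, the group-matching rule, and the fallback to the full case bank are all assumptions made for illustration. Only the retrieval depth k=8 comes from the text above.

```python
import numpy as np


def retrieve_similar_cases(query_latent, case_latents, k=8,
                           case_groups=None, query_group=None):
    """Return indices of the k stored cases closest to the query.

    Hypothetical helper: distance metric and demographic rule are
    illustrative assumptions, not taken from the paper.
    """
    case_latents = np.asarray(case_latents, dtype=float)
    query_latent = np.asarray(query_latent, dtype=float)

    # Start from every stored case.
    candidates = np.arange(len(case_latents))

    # Assumed demographic-aware step: prefer cases from the same group,
    # but fall back to all cases if the group has fewer than k members.
    if case_groups is not None and query_group is not None:
        same_group = candidates[np.asarray(case_groups) == query_group]
        if len(same_group) >= k:
            candidates = same_group

    # Euclidean distance in latent space (an assumed choice of metric).
    distances = np.linalg.norm(case_latents[candidates] - query_latent, axis=1)

    # Stable sort keeps the original order among equidistant cases.
    order = np.argsort(distances, kind="stable")[:k]
    return candidates[order]
```

In a pipeline of this kind, the returned indices would select the stored cases whose ECG features and labels are placed in the LLM prompt as context.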
