Detection of Chagas Disease from the ECG: The George B. Moody PhysioNet Challenge 2025

Matthew A. Reyna, Zuzana Koscova, Jan Pavlus, Soheil Saghafi, James Weigle, Andoni Elola, Salman Seyedi, Kiersten Campbell, Qiao Li, Ali Bahrami Rad, Antônio H. Ribeiro, Antonio Luiz P. Ribeiro, Reza Sameni, Gari D. Clifford

TL;DR

The paper presents the George B. Moody PhysioNet Challenge 2025, which tasks teams with developing open-source algorithms to identify Chagas disease from standard 12-lead ECGs under a real-world testing-capacity constraint. It leverages a large, multi-source dataset with both weak (self-reported) and strong (serologically validated) labels and introduces data augmentation and WFDB-format preprocessing to enhance generalization. A novel constrained ranking metric evaluates true positives within the top $M$ referrals, reflecting limited serological testing capacity in endemic regions and guiding practical triage performance. Results show substantial participation and promising triage capabilities, but reveal notable generalization gaps across unseen cohorts, underscoring challenges in deploying ECG-based screening for Chagas disease at scale.

Abstract

Objective: Chagas disease is a parasitic infection, primarily transmitted by insects, that is endemic to South America, Central America, and, more recently, the U.S. Chronic Chagas disease can cause cardiovascular disease and digestive problems. Serological testing capacity for Chagas disease is limited, but Chagas cardiomyopathy often manifests in ECGs, providing an opportunity to prioritize patients for testing and treatment. Approach: The George B. Moody PhysioNet Challenge 2025 invites teams to develop algorithmic approaches for identifying Chagas disease from electrocardiograms (ECGs). Main results: This Challenge provides multiple innovations. First, we leveraged several datasets with labels from patient reports and serological testing, providing a large dataset with weak labels and smaller datasets with strong labels. Second, we augmented the data to support model robustness and generalizability to unseen data sources. Third, we applied an evaluation metric that captured the local serological testing capacity for Chagas disease to frame the machine learning problem as a triage task. Significance: Over 630 participants from 111 teams submitted over 1300 entries during the Challenge, representing diverse approaches from academia and industry worldwide.

Paper Structure

This paper contains 11 sections, 1 equation, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Illustration of the Challenge evaluation metric in receiver-operating characteristic (ROC) space. The shaded triangle represents the feasible operating region under the fixed testing capacity constraint. The Challenge score corresponds to the true positive rate (TPR) achieved within the top 5% of predicted cases, approximating the real-world serological testing limit. Illustration adapted from Sameni2025roc_geometry.
  • Figure 2: Challenge scores ($x$-axis) on the data sources ($y$-axis) for the hidden validation and test sets. Each point is the score of one method on one dataset, and each thin solid line connects a team's scores across datasets; the thick dashed colored line shows the change in the median score across the datasets.
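
The metric described in the Figure 1 caption, the true positive rate among the top 5% of predicted cases, can be illustrated with a minimal sketch. This is not the Challenge's official scoring code: the function name `tpr_at_top_fraction`, the rounding of the referral count, and the tie-breaking rule are assumptions made here for illustration.

```python
import numpy as np


def tpr_at_top_fraction(labels, scores, fraction=0.05):
    """Fraction of all true positives found among the highest-scored records.

    Hypothetical helper, not the official Challenge scorer.
    labels:   0/1 (or bool) ground-truth Chagas labels, one per record.
    scores:   model outputs, higher meaning more likely positive.
    fraction: share of records that can be referred for serological testing.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if labels.shape != scores.shape or labels.ndim != 1:
        raise ValueError("labels and scores must be 1-D arrays of equal length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    n_positive = labels.sum()
    if n_positive == 0:
        return float("nan")  # TPR is undefined without any positive record

    # Assumption: the referral budget is rounded to the nearest record count,
    # with at least one referral.
    n_referred = max(1, round(fraction * labels.size))

    # Assumption: ties are broken by original record order (stable sort).
    ranked = np.argsort(-scores, kind="stable")
    referred = ranked[:n_referred]

    return float(labels[referred].sum() / n_positive)
```

With 100 records, a 5% capacity refers the 5 highest-scored records; if 2 of the 4 true positives fall among them, the function returns 0.5. Ranking by score rather than thresholding it reflects the triage framing: only the ordering of patients matters once the number of available tests is fixed.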