Predicting Cardiopulmonary Exercise Testing Outcomes in Congenital Heart Disease Through Multi-modal Data Integration and Geometric Learning
Muhammet Alkan, Gruschen Veldtman, Fani Deligianni
TL;DR
This work tackles risk assessment in congenital heart disease (CHD) by predicting biomarkers derived from cardiopulmonary exercise testing (CPET) with a novel multi-modal framework that fuses raw ECG signals with information extracted from clinical letters via NLP. It represents ECGs as covariance matrices, which lie on a Riemannian manifold, projects them to the tangent space, and introduces covariance augmentation to address small, imbalanced CHD datasets. The key finding is that combining ECG-derived covariance features with clinical-letter information, and augmenting them in Riemannian space, substantially improves prediction of CPET biomarkers such as peak $VO_2$ and $VO_2$ (%pred), as well as classification metrics for ventilatory efficiency markers like $VE/VCO_2$, compared to baselines using vendor-derived features or single modalities. This approach demonstrates a promising path toward dynamic, noninvasive risk stratification in CHD based on surrogate mortality endpoints, and highlights the value of non-Euclidean data representations in biomedical multi-modal learning.
Abstract
Cardiopulmonary exercise testing (CPET) provides a comprehensive assessment of functional capacity by measuring key physiological variables, including oxygen consumption ($VO_2$), carbon dioxide production ($VCO_2$), and pulmonary ventilation ($VE$), during exercise. Previous research has established that parameters such as peak $VO_2$ and the $VE/VCO_2$ ratio are robust predictors of mortality risk in patients with chronic heart failure. In this study, we use CPET variables as surrogate mortality endpoints for patients with Congenital Heart Disease (CHD). To our knowledge, this is the first successful implementation of an advanced machine learning approach that predicts CPET outcomes by integrating electrocardiograms (ECGs) with information derived from clinical letters. We first extracted unstructured patient information (including intervention history, diagnoses, and medication regimens) from clinical letters using natural language processing techniques and organized it into a structured database. We then digitized ECGs to obtain quantifiable waveforms and linked them to the corresponding patient records. The core innovation of our approach lies in exploiting the Riemannian geometric properties of covariance matrices derived from both 12-lead ECGs and clinical text data to develop robust regression and classification models. Through extensive ablation studies, we demonstrated that integrating ECG signals with clinical documentation, enhanced by covariance augmentation in Riemannian space, consistently outperformed conventional approaches.
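To make the geometric idea concrete, the following is a minimal NumPy sketch of how a multi-lead ECG segment could be turned into a covariance matrix and mapped to a tangent-space feature vector. It is an illustration under stated assumptions, not the pipeline used in this work: the function names, the shrinkage regulariser, the arithmetic-mean reference point, and the synthetic signals are all hypothetical. The abstract does not specify how covariance augmentation is performed, so `geodesic_interpolate` shows only one plausible option (sampling a point on the affine-invariant geodesic between two covariance matrices).

```python
import numpy as np


def _spd_apply(mat, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    mat = (mat + mat.T) / 2.0  # remove round-off asymmetry before eigh
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * fn(vals)) @ vecs.T


def ecg_covariance(ecg, shrinkage=1e-3):
    """Covariance of an ECG array shaped (n_leads, n_samples).

    Shrinkage toward a scaled identity is an assumption of this sketch; it
    keeps the matrix strictly positive definite even when leads are
    linearly dependent.
    """
    centered = ecg - ecg.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (ecg.shape[1] - 1)
    n = cov.shape[0]
    return (1.0 - shrinkage) * cov + shrinkage * (np.trace(cov) / n) * np.eye(n)


def tangent_vector(cov, ref):
    """Log-map `cov` at the reference point `ref` and vectorise the result.

    Off-diagonal entries are scaled by sqrt(2) so the Euclidean norm of the
    vector equals the Frobenius norm of the symmetric log matrix.
    """
    ref_isqrt = _spd_apply(ref, lambda v: 1.0 / np.sqrt(v))
    log_whitened = _spd_apply(ref_isqrt @ cov @ ref_isqrt, np.log)
    rows, cols = np.triu_indices_from(log_whitened)
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_whitened[rows, cols]


def geodesic_interpolate(cov_a, cov_b, t):
    """Point at fraction t on the affine-invariant geodesic from cov_a to cov_b.

    Hypothetical augmentation step: the result is again symmetric positive
    definite, so it can serve as a synthetic covariance sample.
    """
    a_sqrt = _spd_apply(cov_a, np.sqrt)
    a_isqrt = _spd_apply(cov_a, lambda v: 1.0 / np.sqrt(v))
    middle = _spd_apply(a_isqrt @ cov_b @ a_isqrt, lambda v: v ** t)
    return a_sqrt @ middle @ a_sqrt


# Synthetic stand-in for digitised 12-lead recordings (not real data).
rng = np.random.default_rng(0)
recordings = [rng.standard_normal((12, 5000)) for _ in range(4)]
covs = [ecg_covariance(r) for r in recordings]

# The arithmetic mean is used as a simple reference point; a Riemannian
# mean is the more common choice in practice.
reference = np.mean(covs, axis=0)
features = np.stack([tangent_vector(c, reference) for c in covs])

# One synthetic covariance halfway between the first two recordings.
augmented = geodesic_interpolate(covs[0], covs[1], 0.5)
```

With 12 leads, each tangent-space vector has 12 * 13 / 2 = 78 entries, and these vectors can be passed to ordinary Euclidean regressors or classifiers. Interpolating along the geodesic rather than averaging matrix entries directly keeps the synthetic sample on the manifold of positive definite matrices.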
