Improving Risk Stratification in Hypertrophic Cardiomyopathy: A Novel Score Combining Echocardiography, Clinical, and Medication Data

Marion Taconné, Valentina D. A. Corino, Annamaria Del Franco, Sara Giovani, Iacopo Olivotto, Adrien Al Wazzan, Erwan Donal, Pietro Cerveri, Luca Mainardi

Abstract

Hypertrophic cardiomyopathy (HCM) requires accurate risk stratification to inform decisions regarding implantable cardioverter-defibrillator (ICD) therapy and follow-up management. Current established models, such as the European Society of Cardiology (ESC) score, exhibit moderate discriminative performance. This study develops a robust, explainable machine learning (ML) risk score leveraging routinely collected echocardiographic, clinical, and medication data, typically contained within Electronic Health Records (EHRs), to predict a 5-year composite cardiovascular outcome in HCM patients. The model was trained and internally validated using a large cohort (N=1,201) from the SHARE registry (Florence Hospital) and externally validated on an independent cohort (N=382) from Rennes Hospital. The final Random Forest ensemble model achieved a high internal Area Under the Curve (AUC) of 0.85 ± 0.02, significantly outperforming the ESC score (0.56 ± 0.03). Critically, survival curve analysis on the external validation set showed superior risk separation for the ML score (log-rank p = 8.62 × 10⁻⁴) compared to the ESC score (p = 0.0559). Furthermore, longitudinal analyses demonstrate that the proposed risk score remains stable over time in event-free patients. The model's high interpretability and its capacity for longitudinal risk monitoring make it a promising tool for the personalized clinical management of HCM.
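
To make the two quantities in the abstract concrete, the following is a minimal sketch of how an ensemble mean prediction and an AUC could be computed. It is not the authors' implementation: the function names `ensemble_mean` and `auc`, the toy probabilities, and the labels are all hypothetical, and simple averaging of the five fold models is an assumption based on the "ensemble mean prediction" wording in the figure captions.

```python
from statistics import mean


def ensemble_mean(fold_predictions):
    """Average each patient's predicted risk across the fold models.

    fold_predictions holds one inner list per fold model; each inner list
    has one predicted probability per patient, in the same patient order.
    """
    return [mean(patient_scores) for patient_scores in zip(*fold_predictions)]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen event patient scores
    higher than a randomly chosen event-free patient (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patient in each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: five fold models, four patients (not study data).
fold_predictions = [
    [0.10, 0.80, 0.30, 0.60],
    [0.20, 0.70, 0.40, 0.50],
    [0.10, 0.90, 0.20, 0.70],
    [0.30, 0.80, 0.30, 0.40],
    [0.30, 0.80, 0.30, 0.80],
]
labels = [0, 1, 1, 0]  # 1 = composite event within 5 years (made up)

risk = ensemble_mean(fold_predictions)
print("ensemble risk per patient:", risk)
print("AUC:", auc(labels, risk))
```

Read this way, an AUC of 0.85 means the score ranks an event patient above an event-free patient in roughly 85% of such pairs, whereas 0.56 is only slightly better than the 0.5 expected from random ranking.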

Paper Structure

This paper contains 14 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Methodological steps, separated into the training-test phase (top) and the validation phase (bottom), the latter including the survival analysis and the longitudinal analysis. The five trained models are combined to form a final ensemble model.
  • Figure 2: Mean ROC curves of the nested cross-validation of the LR, GB, RF and SVM models compared against the ESC risk score.
  • Figure 3: Average SHAP summary plot of the global feature importance and directional impact across the five folds of the RF model. Echocardiographic features are written in blue. NYHA: New York Heart Association index, LVESV/LVEDV: Left Ventricle End-Systolic/End-Diastolic Volume, LVEF: Left Ventricle Ejection Fraction, PWT: Posterior Wall Thickness, sep E/E': ratio of the E wave to septal E', E': peak E' velocity, LA diameter: Left Atrium diameter, LVOT: LV Outflow Tract Obstruction, LVIDd/s: LV Internal Dimension at end-diastole/systole, MWT: Maximum Wall Thickness, lat E': lateral E', dias/sys BP: diastolic/systolic Blood Pressure, FH HCM: family history of HCM, NSVT: non-sustained ventricular tachycardia, RAAS: Renin-angiotensin-aldosterone system medication.
  • Figure 4: Longitudinal analysis visualization: the model's prediction value at each exam (blue-to-red scatter points) and the linear regression lines colored by their slope value (red-to-green). Only patients from the first test-set fold who experienced the endpoint and had at least 5 exams are plotted (P1-P45). Four patients (P4, P17, P32 and P41) are zoomed in to show the dynamics of the slopes.
  • Figure 5: Distribution of the RF ensemble mean prediction on the external validation population (Rennes), separated by true label (event 0/1), compared with the ESC score (right). Kernel density estimates (KDE) are overlaid on both histograms.
  • ...and 1 more figures