Electrocardiogram-based diagnosis of liver diseases: an externally validated and explainable machine learning approach

Juan Miguel Lopez Alcaraz, Wilhelm Haverkamp, Nils Strodthoff

TL;DR

This study tackles noninvasive liver-disease detection using ECG signals with external validation and explainable AI. It employs tree-based models (XGBoost) on ECG features plus demographics to predict six ICD-10-CM liver-disease codes, with Shapley-value explanations to reveal clinically plausible biomarkers such as age and QTc. External AUROCs reach up to 0.8777 for certain alcohol-related conditions, demonstrating strong cross-cohort generalization and interpretability through feature attributions. The work supports ECG-based screening as a cost-effective tool for early detection and risk stratification, particularly in resource-limited settings, and identifies robust biomarkers that bridge cardiovascular and hepatic pathophysiology.

Abstract

Background: Liver diseases present a significant global health challenge and often require costly, invasive diagnostics. Electrocardiography (ECG), a widely available and non-invasive tool, can enable the detection of liver disease by capturing cardiovascular-hepatic interactions. Methods: We trained tree-based machine learning models on ECG features to detect liver diseases using two large datasets: MIMIC-IV-ECG (467,729 patients, 2008-2019) and ECG-View II (775,535 patients, 1994-2013). The task was framed as binary classification, with performance evaluated via the area under the receiver operating characteristic curve (AUROC). To improve interpretability, we applied explainability methods to identify key predictive features. Findings: The models showed strong predictive performance with good generalizability. For example, AUROCs for alcoholic liver disease (K70) were 0.8025 (95% confidence interval (CI), 0.8020-0.8035) internally and 0.7644 (95% CI, 0.7641-0.7649) externally; for hepatic failure (K72), scores were 0.7404 (95% CI, 0.7389-0.7415) and 0.7498 (95% CI, 0.7494-0.7509), respectively. The explainability analysis consistently identified age and prolonged QTc intervals (corrected QT, reflecting ventricular repolarization) as key predictors. Features linked to autonomic regulation and electrical conduction abnormalities were also prominent, supporting known cardiovascular-liver connections and suggesting QTc as a potential biomarker. Interpretation: ECG-based machine learning offers a promising, interpretable approach for liver disease detection, particularly in resource-limited settings. By revealing clinically relevant biomarkers, this method supports non-invasive diagnostics, early detection, and risk stratification prior to targeted clinical assessments.
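
The abstract reports each AUROC together with a 95% confidence interval. As a minimal sketch of how such a pair can be computed for a binary task, the snippet below uses the rank-based (Mann-Whitney) form of the AUROC and a percentile bootstrap over samples. The bootstrap procedure, the function names `auroc` and `bootstrap_ci`, and the toy labels and scores are assumptions for illustration; the abstract does not state how the authors derived their intervals.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random
    negative, with ties counted as 0.5 (rank-based Mann-Whitney form)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Ranks i+1 .. j share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (illustrative assumption,
    not necessarily the procedure used in the paper)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need both classes to bootstrap the AUROC")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[k] for k in idx]
        if 0 < sum(y) < n:  # skip resamples that contain a single class
            stats.append(auroc(y, [scores[k] for k in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical labels (1 = disease code present) and model scores.
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(auroc(y_true, y_score))  # 0.75
    print(bootstrap_ci(y_true, y_score, n_boot=200))
```

With cohorts of several hundred thousand patients, as in the two datasets described above, resampled AUROCs vary little, which is consistent with the narrow intervals reported in the abstract.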

Paper Structure

This paper contains 12 sections, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Illustrative representation of our proposed approach. The MIMIC-IV-ECG dataset serves as the internal dataset, from which demographics and ECG features are used as inputs to train a tree-based model that predicts diverse liver diseases. For external validation, we take a second cohort of patients from the ECG-View II dataset, from which we collect the same ECG features and liver diseases. We define liver diseases by means of ICD-10-CM codes for a well-defined disease representation.
  • Figure 2: Explainability results for the six investigated conditions. The beeswarm plot indicates, for every sample and every feature, whether the corresponding feature contributes positively (right-hand side) or negatively (left-hand side) to the model prediction. An additional color coding indicates whether a data point is associated with high (red) or low (blue) feature values.
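
To make the sign convention in Figure 2 concrete, the following sketch computes exact Shapley values for a single sample of a toy scoring function by enumerating feature coalitions. It is an illustration only: the toy model, its coefficients, the feature names `age` and `qtc`, and the baseline values are hypothetical, and the brute-force enumeration stands in for whichever Shapley-value estimator the authors used with their tree-based models.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value
    (an assumption; other ways of "removing" features exist).
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        z = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return model(z)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_model(z):
    # Hypothetical risk score: higher age and longer QTc raise the score.
    return 0.02 * z["age"] + 0.01 * z["qtc"]


if __name__ == "__main__":
    baseline = {"age": 60.0, "qtc": 420.0}   # hypothetical reference patient
    patient = {"age": 70.0, "qtc": 400.0}    # older, but shorter QTc
    phi = shapley_values(toy_model, patient, baseline)
    for name, contribution in phi.items():
        side = "right (positive)" if contribution > 0 else "left (negative)"
        print(name, round(contribution, 4), side)
```

In this toy case the above-baseline age pushes the score up and would appear on the right-hand side of a beeswarm plot, while the below-baseline QTc pushes it down and would appear on the left. The contributions also sum to the difference between the model output for the patient and for the baseline, which is the property that makes such attributions additive.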