MIEO: encoding clinical data to enhance cardiovascular event prediction

Davide Borghini, Davide Marchi, Angelo Nardone, Giordano Scerra, Silvia Giulia Galfrè, Alessandro Pingitore, Giuseppe Prencipe, Corrado Priami, Alina Sîrbu

TL;DR

The paper tackles limited labelled data and missing values in cardiovascular event prediction from structured clinical data. It introduces MIEO, a masked-input, self-supervised autoencoder that exploits unlabelled data to learn latent representations while imputing missing values, which are then used by an ANN classifier to predict cardiovascular death within eight years. Empirical results show that MIEO embeddings can match or slightly surpass direct feature-based classification in balanced accuracy, demonstrating the value of latent-space features under data heterogeneity and label scarcity. This work advances the use of self-supervised encoding for robust clinical predictions and points to potential applications in creating deep digital twins from structured patient data.

Abstract

As clinical data are becoming increasingly available, machine learning methods have been employed to extract knowledge from them and predict clinical events. While promising, these approaches suffer from at least two main issues: low availability of labelled data and data heterogeneity leading to missing values. This work proposes the use of self-supervised auto-encoders to efficiently address these challenges. We apply our methodology to a clinical dataset from patients with ischaemic heart disease. Patient data are embedded in a latent space, built using unlabelled data, which is then used to train a neural network classifier to predict cardiovascular death. Results show improved balanced accuracy compared to applying the classifier directly to the raw data, demonstrating that this solution is promising, especially in conditions where availability of unlabelled data could increase.
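The results are reported as balanced accuracy, which is the mean of the per-class recalls. The sketch below is a minimal illustration of that standard definition; the function name and the toy labels are assumptions made for this example and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls (standard definition of balanced accuracy)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Indices of all samples whose true label is class c.
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy labels (illustrative only): 8 negatives, 2 positives.
y_true = [0] * 8 + [1] * 2

# A classifier that always predicts the majority class.
always_negative = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
ba_majority = balanced_accuracy(y_true, always_negative)

# A classifier that finds one of the two positives and makes two false alarms.
mixed = [0] * 6 + [1] * 2 + [1, 0]
ba_mixed = balanced_accuracy(y_true, mixed)
```

On these toy labels the majority-class predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which is why the metric is a common choice when one outcome is much rarer than the other.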

Paper Structure

This paper contains 8 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Simplified representations of our models: the MIEO autoencoder, the classifier for the downstream task applied directly to clinical data, and the classifier applied to MIEO embeddings.
  • Figure 2: A graphical example of the target value and the output used to calculate the loss in the MIEO model.
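The page does not reproduce the loss that Figure 2 illustrates. As a rough sketch of how a masked-input reconstruction loss can work in general, the code below hides some observed features from the input and scores the reconstruction only on features that were actually recorded. This is one plausible reading under stated assumptions, not the authors' formulation: the function names `mask_inputs` and `masked_mse`, the zero fill value, the masking rate, and the use of mean squared error are all hypothetical choices made for illustration.

```python
import random


def mask_inputs(row, observed, mask_rate, rng):
    """Hide a random subset of the observed features (hypothetical helper).

    Returns the corrupted row, with hidden or missing entries set to 0.0,
    and a per-feature flag telling which entries remain visible to the model.
    """
    visible = [obs and rng.random() >= mask_rate for obs in observed]
    corrupted = [x if vis else 0.0 for x, vis in zip(row, visible)]
    return corrupted, visible


def masked_mse(target, output, observed):
    """Mean squared error over observed features only (assumed loss form).

    Genuinely missing values have no ground truth, so they are skipped.
    """
    errors = [(t - o) ** 2 for t, o, obs in zip(target, output, observed) if obs]
    return sum(errors) / len(errors) if errors else 0.0


rng = random.Random(0)
row = [0.5, None, 1.2, 0.1]                 # None marks a missing clinical value
observed = [x is not None for x in row]
corrupted, visible = mask_inputs(row, observed, mask_rate=0.5, rng=rng)

reconstruction = [0.4, 0.0, 1.0, 0.1]       # stand-in for a decoder output
loss = masked_mse(row, reconstruction, observed)
```

In this reading, the network receives `corrupted` as input while the loss compares its output with the original `row`, so the model is pushed to infer hidden values from the remaining features, which is the same ability needed to impute genuinely missing ones.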