Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases

Andrés Bell-Navas, María Villalba-Orero, Enrique Lara-Pezzi, Jesús Garicano-Mena, Soledad Le Clainche

TL;DR

The paper tackles predicting the time to heart failure from echocardiography when annotated data are scarce. It proposes a two-stage framework that combines Modal Decomposition (SVD/HODMD) for data generation and feature extraction with Masked Autoencoder–based self-supervised and supervised training of a Vision Transformer. An enlarged, homogenized echocardiography database is created, and the experiments show that the approach outperforms established ViT and CNN baselines while running inference in real time. The source code is to be incorporated into the next release of the ModelFLOWs-app software.

Abstract

Heart diseases remain the leading cause of mortality worldwide, accounting for approximately 18 million deaths according to the WHO. In particular, heart failure (HF) pushes the healthcare industry to develop systems for its early, rapid, and effective prediction. This work presents an automatic system based on a novel framework that combines Modal Decomposition and Masked Autoencoders (MAE) to extend the application from heart disease classification to the more challenging and specific task of heart failure time prediction, which, to the best of the authors' knowledge, has not been addressed previously. The system comprises two stages. The first transforms a database of echocardiography video sequences into a large collection of annotated images compatible with the training phase of both machine learning-based and deep learning-based frameworks. This stage uses the Higher Order Dynamic Mode Decomposition (HODMD) algorithm for both data augmentation and feature extraction. The second stage builds and trains a Vision Transformer (ViT). MAEs based on a combined scheme of self-supervised learning (SSL) and supervised learning, so far barely explored in the literature on heart failure prediction, are adopted to train the ViT effectively from scratch, even with scarce databases. The resulting neural network analyses images from echocardiography sequences in real time to estimate the time at which heart failure will occur. The approach improves prediction accuracy on scarce databases and outperforms several established ViT and Convolutional Neural Network (CNN) architectures. The source code will be incorporated into the next release of the ModelFLOWs-app software (https://github.com/modelflows/ModelFLOWs-app).
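
To make the modal decomposition idea in the first stage more concrete, the following is a minimal sketch of how a video sequence can be split into spatial modes and a low-rank reconstruction. It uses a plain truncated SVD on a snapshot matrix, not the HODMD algorithm itself, and it runs on synthetic data. The function name `svd_modes_and_reconstruction`, the frame size, and the chosen rank are illustrative assumptions and are not taken from the paper or from ModelFLOWs-app.

```python
import numpy as np


def svd_modes_and_reconstruction(frames, rank):
    """Truncated SVD of a video (hypothetical helper, not the paper's code).

    frames: array of shape (T, H, W), one grayscale image per time step.
    Returns (modes, recon) with shapes (rank, H, W) and (T, H, W).
    """
    t, h, w = frames.shape
    # Snapshot matrix: each column is one flattened frame.
    snapshots = frames.reshape(t, h * w).T.astype(np.float64)
    u, s, vt = np.linalg.svd(snapshots, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    u_r, s_r, vt_r = u[:, :rank], s[:rank], vt[:rank, :]
    modes = u_r.T.reshape(rank, h, w)                # spatial modes as images
    recon = ((u_r * s_r) @ vt_r).T.reshape(t, h, w)  # low-rank video
    return modes, recon


# Synthetic stand-in for an echocardiography sequence: two spatial
# patterns oscillating in time, so the data has rank 2 by construction.
rng = np.random.default_rng(0)
t_axis = np.linspace(0.0, 2.0 * np.pi, 20)
pattern_a = rng.standard_normal((8, 8))
pattern_b = rng.standard_normal((8, 8))
frames = (np.sin(t_axis)[:, None, None] * pattern_a
          + np.cos(t_axis)[:, None, None] * pattern_b)

modes, recon = svd_modes_and_reconstruction(frames, rank=2)
print(modes.shape, recon.shape)
```

In a pipeline of the kind the abstract describes, both the mode images and the reconstructed frames could serve as additional training samples alongside the original frames. HODMD additionally associates each mode with a frequency and growth rate, which this sketch does not attempt.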

Paper Structure

This paper contains 10 sections, 5 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overall structure of the proposed heart failure prediction system, including representations of echocardiography images with non-heart regions (i.e., the electrocardiogram, medical information, and the black background).
  • Figure 2: Block diagram of the Cardiac Database Creation stage, composed of two phases: (1) Data Homogenization, and (2) Modal Decomposition-based Data Generation.
  • Figure 3: Block diagram of the HODMD algorithm applied on a video sequence of echocardiography images.
  • Figure 4: Block diagram of the Heart Failure Prediction stage, composed of four phases: (1) Data Homogenization, (2) Modal Decomposition-based Data Transform, (3) Deep Neural Network-based Heart Failure Prediction, and (4) Fusion of Heart Failure Predictions.
  • Figure 5: Architecture of the proposed deep neural network with the joint self-supervised (SSL) and supervised learning scheme. (a) The input: an image from an echocardiography video sequence, in the form of an original echocardiography sample, a mode, or a reconstruction obtained with the SVD and HODMD algorithms, resulting from the Modal Decomposition-based Data Transform phase. (b) The Self-supervised Auxiliary Task (SSAT), which aims to reconstruct the missing patches of the masked image and thereby helps train the ViT for the Regression Task. (c) The Regression Task, i.e., the heart failure time prediction task. (d) The predicted heart failure time of the input image.
  • ...and 2 more figures
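
The masking step behind the Self-supervised Auxiliary Task described for Figure 5 can be illustrated with a short sketch: an image is split into patches, a random subset is hidden, and the reconstruction error is measured only on the hidden patches. This is a generic MAE-style illustration in NumPy, not the authors' network. The helper names, the 32×32 image, the patch size of 8, and the mask ratio are all assumptions; 0.75 is the ratio used in the original MAE work on natural images, not a value taken from this paper.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W) image into non-overlapping flattened patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0
    gh, gw = h // patch, w // patch
    return (image.reshape(gh, patch, gw, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(gh * gw, patch * patch))


def random_mask(patches, mask_ratio, rng):
    """Return the visible patches and a boolean mask (True = hidden)."""
    n = patches.shape[0]
    n_keep = int(round(n * (1.0 - mask_ratio)))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    mask = np.ones(n, dtype=bool)
    mask[keep_idx] = False
    return patches[keep_idx], mask


def masked_mse(pred, target, mask):
    """Mean squared error averaged over the hidden patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()


rng = np.random.default_rng(0)
image = rng.random((32, 32))        # stand-in for one echocardiography frame
patches = patchify(image, patch=8)  # 16 patches of 64 pixels each
visible, mask = random_mask(patches, mask_ratio=0.75, rng=rng)

# A real model would encode `visible` and decode all patches; here a
# constant prediction stands in so that the loss can be evaluated.
pred = np.zeros_like(patches)
loss = masked_mse(pred, patches, mask)
print(visible.shape, int(mask.sum()), float(loss))
```

In the joint scheme outlined in the caption, a reconstruction loss of this kind would be combined with the regression loss on the predicted heart failure time. How the two terms are weighted is not stated in this excerpt.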