EchoNarrator: Generating natural text explanations for ejection fraction predictions

Sarina Thomas, Qing Cao, Anna Novikova, Daria Kulikova, Guy Ben-Yosef

TL;DR

We propose a model that, in a single forward pass, estimates the LV contour over multiple frames and computes a set of motion and shape attributes associated with ejection fraction. The attributes are then fed into a large language model, which generates text that helps explain the network's outcome in a human-like manner.

Abstract

Ejection fraction (EF) of the left ventricle (LV) is considered one of the most important measurements for diagnosing acute heart failure and can be estimated during cardiac ultrasound acquisition. While recent deep learning models estimate EF values successfully, they often lack an explanation for the prediction. Providing clear and intuitive explanations for clinical measurement predictions would increase the trust of cardiologists in these models. In this paper, we explore predicting EF measurements with Natural Language Explanation (NLE). We propose a model that, in a single forward pass, combines estimation of the LV contour over multiple frames with a set of modules and routines for computing various motion and shape attributes that are associated with ejection fraction. It then feeds the attributes into a large language model to generate text that helps explain the network's outcome in a human-like manner. We provide an experimental evaluation of both our explanatory output and the EF prediction, and show that our model can provide EF estimates comparable to the state of the art, together with a meaningful and accurate natural language explanation for the prediction. The project page can be found at https://github.com/guybenyosef/EchoNarrator .
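
For reference, EF is conventionally defined from the end-diastolic volume (EDV) and end-systolic volume (ESV) of the LV. The symbols below follow standard clinical notation and are an assumption here; they are not taken from the paper's own equations.

```latex
\[
  \mathrm{EF} = \frac{\mathrm{EDV} - \mathrm{ESV}}{\mathrm{EDV}} \times 100\%
\]
% Worked example with illustrative values (not from the paper):
% EDV = 120 mL, ESV = 50 mL
\[
  \mathrm{EF} = \frac{120 - 50}{120} \times 100\% \approx 58.3\%
\]
```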

Paper Structure

This paper contains 13 sections, 4 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Overview of the proposed pipeline. An ultrasound (US) video is fed into a CNN video encoder that outputs a feature representation. The features are passed to a spatio-temporal GCN that returns keypoints for end-diastole (ED) and end-systole (ES). The keypoints serve as 1) input for MLPs that regress the LV volumes (in green) or the EF directly (in orange), and 2) the basis for computing geometrical attributes, which are converted into text snippets that can be passed to an LLM. The LLM provides a human-like explanation for the EF.
  • Figure 2: (Zoom in for optimal view) LV contour estimation, EF prediction and its text explanation as provided by the NLE-EF-13B self-instruct on EchoNet test examples.
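
The attribute-to-text step described in Figure 1 can be illustrated with a minimal sketch. It assumes the ED and ES keypoints are given as closed 2D contours, and it uses fractional area change as a stand-in attribute. The function names, the choice of attribute, and the snippet wording are all hypothetical; this is not the authors' implementation, which is available on the project page.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(contour: List[Point]) -> float:
    """Area enclosed by a closed 2D contour (shoelace formula)."""
    area = 0.0
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]  # wrap around to close the contour
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def ejection_fraction(edv: float, esv: float) -> float:
    """Standard EF definition in percent, from ED and ES volumes."""
    if edv <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv - esv) / edv


def attributes_to_text(ed_contour: List[Point], es_contour: List[Point]) -> str:
    """Turn simple geometric attributes into a text snippet for an LLM prompt.

    Hypothetical attribute set: contour areas at ED and ES and their
    fractional change. The paper's actual attributes may differ.
    """
    ed_area = polygon_area(ed_contour)
    es_area = polygon_area(es_contour)
    change = 100.0 * (ed_area - es_area) / ed_area
    return (
        f"LV area at ED: {ed_area:.1f}; "
        f"LV area at ES: {es_area:.1f}; "
        f"fractional area change: {change:.1f}%."
    )
```

A snippet produced this way could be appended to an LLM prompt alongside the predicted EF, so that the generated explanation refers to measurable quantities rather than to opaque network features.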