Temporally Consistent Mitral Annulus Measurements from Sparse Annotations in Echocardiographic Videos

Gino E. Jansen, Mark J. Schuuring, Berto J. Bouma, Ivana Išgum

TL;DR

The paper tackles automatic localization of mitral annulus landmarks in echocardiography videos under sparse annotations by enforcing temporal consistency and handling missing landmarks. It introduces a self-supervised temporal consistency loss and field-of-view augmentations within a ResNet-like fully convolutional landmark detector that jointly performs classification and regression. On CAMUS and a private AUMC dataset, the approach achieves a MAPSE MAE of $1.81 \pm 0.14$ mm, an annulus size MAE of $2.46 \pm 0.31$ mm, a landmark location MAE of $2.48 \pm 0.07$ mm, and a ROC-AUC of $0.99$ for recognizing missing landmarks, outperforming the baseline. These improvements translate to smoother landmark trajectories and more accurate MAPSE measurements, with potential clinical impact for risk stratification and improved left-ventricular function assessment, especially when landmarks momentarily leave the field of view.

Abstract

This work presents a novel approach to achieving temporally consistent mitral annulus landmark localization in echocardiography videos using sparse annotations. Our method introduces a self-supervised loss term that enforces temporal consistency between neighboring frames, which smooths the position of landmarks and enhances measurement accuracy over time. Additionally, we incorporate realistic field-of-view augmentations to improve the recognition of missing anatomical landmarks. We evaluate our approach on both a public and a private dataset, and demonstrate significant improvements in Mitral Annular Plane Systolic Excursion (MAPSE) calculations and overall landmark tracking stability. The method achieves a mean absolute MAPSE error of 1.81 $\pm$ 0.14 mm, an annulus size error of 2.46 $\pm$ 0.31 mm, and a landmark localization error of 2.48 $\pm$ 0.07 mm. Finally, it achieves a 0.99 ROC-AUC for recognition of missing landmarks.

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Architecture of the proposed method. A ResNet-like fully convolutional neural network takes three consecutive frames as input, stacked in the channel dimension, and outputs classification and regression maps for mitral annulus landmarks. The blue convolutional blocks include GroupNorm, while other blocks omit normalization. ReLU is used as the activation function unless stated otherwise. Temporal consistency is enforced through an unsupervised loss that compares predicted displacements across neighboring frames, while supervised losses guide the regression and classification outputs. During test time, landmarks are computed as the weighted mean of regressed locations from each patch, using the respective probabilities from the classification map.
  • Figure 2: Training image and associated targets for the right annulus landmark undergoing field-of-view augmentation. The original image is cropped using a sector-shaped cropping window, which excludes the right landmark due to random cropping. Consequently, the regression target is ignored and excluded from back-propagation, while the classification target is updated to an array of all zeros to reflect the absence of the landmark in the cropped region.
  • Figure 3: (a, b) Correlation plots of experiments comparing the temporally consistent method with the baseline approach Noothout2020. (c) ROC curves for recognition of missing landmarks.
  • Figure 4: Left: Axial position of the left annulus landmark plotted over time for a randomly selected test video, comparing the smoothness of the baseline Noothout2020 and the proposed method. Right: Absolute jerk (third-order derivative) over time, illustrating reduced jerk values for the proposed method. For this visualization, jerk is computed from axial displacement only, while the values reported in Table 1 are based on the 2D position.
  • Figure 5: Annulus excursion over time plotted for three test video sequences, comparing the baseline Noothout2020 with the proposed method (a–c). The plots display the annulus excursion (in mm) per frame, with the predicted end-diastole (ED, downward triangles) and end-systole (ES, upward triangles) indicated. The maximum annulus excursion (MAPSE) is represented by the dashed line for the reference method, while the predicted MAPSE corresponds to the excursion value at ES (excursion at ED is defined as 0). The accompanying ultrasound frames show the predicted annulus (solid line) and the reference annotation (dashed line) for both methods.
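
The quantities named in the captions above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names, the array layouts (an `(H, W)` probability map and an `(H, W, 2)` map of regressed locations), the uniform frame interval `dt`, and the choice to return `None` when all probabilities are zero are assumptions made here for illustration. The sketch covers the test-time weighted mean from Figure 1, the absolute jerk from Figure 4 as a third-order finite difference, and the MAPSE reading from Figure 5 as the excursion at ES relative to ED.

```python
import numpy as np


def decode_landmark(prob_map, coord_map):
    """Probability-weighted mean of the per-patch regressed locations.

    prob_map:  (H, W) landmark probability per patch (assumed layout).
    coord_map: (H, W, 2) regressed (x, y) location per patch (assumed layout).
    Returns None when all probabilities are zero (assumed handling of a
    landmark that is outside the field of view).
    """
    weights = np.asarray(prob_map, dtype=float).reshape(-1)
    coords = np.asarray(coord_map, dtype=float).reshape(-1, 2)
    total = weights.sum()
    if total <= 0.0:
        return None
    return (weights[:, None] * coords).sum(axis=0) / total


def absolute_jerk(positions, dt):
    """Magnitude of the third-order finite difference of a trajectory.

    positions: (T,) for a single coordinate or (T, 2) for 2D positions.
    dt: time between consecutive frames (uniform frame rate assumed).
    """
    p = np.asarray(positions, dtype=float)
    d3 = np.diff(p, n=3, axis=0) / dt**3
    # 1D input: absolute value; 2D input: Euclidean norm per time step.
    return np.abs(d3) if d3.ndim == 1 else np.linalg.norm(d3, axis=1)


def mapse_from_excursion(excursion_mm, ed_index, es_index):
    """Excursion at ES relative to ED, with the excursion at ED defined as 0."""
    e = np.asarray(excursion_mm, dtype=float)
    return float(e[es_index] - e[ed_index])
```

Passing a `(T,)` array of axial positions to `absolute_jerk` corresponds to the axial-only visualization in Figure 4, while a `(T, 2)` array gives a jerk magnitude based on the 2D position. How the per-frame excursion curve itself is derived from the landmarks is not specified in the captions, so `mapse_from_excursion` takes it as an input.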