Deep Spatiotemporal Clutter Filtering of Transthoracic Echocardiographic Images: Leveraging Contextual Attention and Residual Learning

Mahdi Tabassian, Somayeh Akbari, Sandro Queirós, Jan D'hooge

TL;DR

The paper tackles reverberation clutter in transthoracic echocardiography (TTE) with a 3D convolutional autoencoder that combines contextual attention gates and residual learning to filter clutter across the cardiac cycle. The network is trained on synthetic sequences from six ultrasound vendors with simulated artifacts superimposed, and it generalizes to unseen synthetic and in vivo data. Quantitative metrics of reconstruction quality and spatiotemporal coherence, together with qualitative and in vivo assessments, support the results. Filtering also reduces the discrepancy between strain curves computed from cluttered and clutter-free segments. The trained network processes a sequence in a fraction of a second, which enables real-time filtering and may improve clinical indices derived from TTE sequences.

Abstract

This study presents a deep convolutional autoencoder network for filtering reverberation clutter from transthoracic echocardiographic (TTE) image sequences. Given the spatiotemporal nature of this type of clutter, the filtering network employs 3D convolutional layers to suppress it throughout the cardiac cycle. The design of the network incorporates two key features that contribute to the effectiveness of the filter: 1) an attention mechanism for focusing on cluttered regions and leveraging contextual information, and 2) residual learning for preserving fine image structures. To train the network, a diverse set of artifact patterns was simulated and superimposed onto ultra-realistic synthetic TTE sequences from six ultrasound vendors, generating input for the filtering network. The artifact-free sequences served as ground-truth. Performance of the filtering network was evaluated using unseen synthetic and in vivo artifactual sequences. Results from the in vivo dataset confirmed the network's strong generalization capabilities, despite being trained solely on synthetic data and simulated artifacts. The suitability of the filtered sequences for downstream processing was assessed by computing segmental strain curves. A significant reduction in the discrepancy between strain profiles computed from cluttered and clutter-free segments was observed after filtering the cluttered sequences with the proposed network. The trained network processes a TTE sequence in a fraction of a second, enabling real-time clutter filtering and potentially improving the precision of clinically relevant indices derived from TTE sequences. The source code of the proposed method and example video files of the filtering results are available at: https://github.com/MahdiTabassian/Deep-ClutterFiltering/tree/main.

Paper Structure (26 sections, 11 equations, 17 figures, 4 tables)

This paper contains 26 sections, 11 equations, 17 figures, 4 tables.

Figures (17)

  • Figure 1: Examples of the ultra-realistic synthetic images from six ultrasound vendors (Alessandrini et al., 2017).
  • Figure 2: Schematic representation of the reverberation clutter pattern simulation. The grayscale value of each pixel within a rectangular region of interest is determined by its position relative to the means of two independent univariate Gaussian distributions. The rectangle's dimensions extend 3$\sigma$ in both the horizontal and vertical directions. The central pixel $i$, located at the intersection of the means, exhibits the highest grayscale value. Pixels closer to the rectangle's corners have lower grayscale values due to their lower probability densities from the distributions.
  • Figure 3: Schematic representation of artifactual B-mode image generation using the simulated (a) near-field (NF) and (b) ribs- and/or lung-induced (RL) clutter patterns. The simulated patterns were added to the artifact-free images and the clutter pixels located outside the sectorial field-of-view were pruned by setting them to zero. The center of each RL clutter pattern was positioned in either the right or left sector, each with an opening angle of $a = 35^\circ$. This ensures proximity of the simulated patterns to the sector edges of the B-mode image.
  • Figure 4: Architecture of the proposed spatiotemporal clutter filtering network. This fully convolutional autoencoder, based on the 3D U-Net, is designed to generate filtered TTE sequences that are coherent in both space and time. An input-output skip connection was incorporated to preserve fine image structures, while attention gate (AG) modules enable the network to focus on clutter zones and leverage contextual information for efficient image reconstruction. The size of the max-pooling window was set to ($2\times2\times1$) to preserve the original temporal dimension (i.e., the number of frames) of the input TTE sequences at all levels of the encoding path.
  • Figure 5: Internal architecture of the additive attention gate (AG) module. The salient regions on the feature maps at scale $l$, ($x^{l}$), are highlighted by leveraging the information encoded in the coarse feature maps of the subsequent scale ($g$).
  • ...and 12 more figures
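
The clutter simulation described in the Figure 2 and Figure 3 captions can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The function names, the product of the two Gaussian profiles as the way of combining them, the reading of "3σ" as three standard deviations on each side of the centre, the [0, 1] intensity range, and the boolean field-of-view mask are all assumptions made for the example.

```python
import numpy as np


def gaussian_clutter_patch(sigma_x, sigma_y, peak=1.0):
    """Rectangular clutter patch built from two independent 1D Gaussians.

    Assumption: the rectangle spans 3*sigma on each side of the centre,
    and the two profiles are combined by multiplication.
    """
    half_w = int(round(3 * sigma_x))
    half_h = int(round(3 * sigma_y))
    x = np.arange(-half_w, half_w + 1)
    y = np.arange(-half_h, half_h + 1)
    gx = np.exp(-0.5 * (x / sigma_x) ** 2)  # horizontal profile, 1.0 at the mean
    gy = np.exp(-0.5 * (y / sigma_y) ** 2)  # vertical profile, 1.0 at the mean
    # Brightest pixel at the intersection of the means, dimmest at the corners.
    return peak * np.outer(gy, gx)


def add_clutter(image, patch, center, fov_mask):
    """Superimpose a clutter patch on an image and prune it outside the sector.

    image    : 2D float array with values in [0, 1] (assumed range)
    center   : (row, col) of the patch centre in image coordinates
    fov_mask : 2D bool array, True inside the sectorial field of view
    """
    clutter = np.zeros(image.shape, dtype=float)
    ph, pw = patch.shape
    r0, c0 = center[0] - ph // 2, center[1] - pw // 2
    # Clip the patch footprint to the image bounds.
    r_start, r_end = max(r0, 0), min(r0 + ph, image.shape[0])
    c_start, c_end = max(c0, 0), min(c0 + pw, image.shape[1])
    if r_end > r_start and c_end > c_start:
        clutter[r_start:r_end, c_start:c_end] = patch[
            r_start - r0:r_end - r0, c_start - c0:c_end - c0
        ]
    # Clutter pixels outside the field of view are set to zero.
    clutter[~fov_mask] = 0.0
    return np.clip(image + clutter, 0.0, 1.0)
```

For example, `gaussian_clutter_patch(2.0, 3.0)` returns a 19 × 13 patch whose centre pixel equals `peak`, and `add_clutter` leaves the image unchanged wherever `fov_mask` is `False`.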
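
The effect of the ($2\times2\times1$) max-pooling window mentioned in the Figure 4 caption can be illustrated with a small NumPy sketch. It assumes a single-channel sequence stored as (height, width, frames) with even height and width; the function name and axis order are hypothetical and chosen only for this example, and the actual network operates on multi-channel feature maps.

```python
import numpy as np


def spatial_max_pool(seq):
    """Max-pool a (H, W, T) sequence with a 2x2x1 window.

    Only the two spatial axes are downsampled; the number of frames T
    is left untouched. Assumes H and W are even.
    """
    h, w, t = seq.shape
    # Group pixels into non-overlapping 2x2 spatial blocks per frame.
    blocks = seq.reshape(h // 2, 2, w // 2, 2, t)
    # Take the maximum over each 2x2 block.
    return blocks.max(axis=(1, 3))


frames = np.arange(4 * 4 * 3).reshape(4, 4, 3)
pooled = spatial_max_pool(frames)
print(pooled.shape)  # spatial size halves, frame count stays at 3
```

Because the window has size 1 along the frame axis, every encoder level sees the same number of frames as the input sequence, which is the property the caption highlights.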