Multi-Stage Residual-Aware Unsupervised Deep Learning Framework for Consistent Ultrasound Strain Elastography

Shourov Joarder, Tushar Talukder Showrav, Md. Kamrul Hasan

TL;DR

The paper tackles robust, temporally coherent strain estimation in ultrasound strain elastography under decorrelation noise and limited ground truth. It introduces MUSSE-Net, an unsupervised, multi-stage framework built on the USSE-Net backbone with a Context-Aware Complementary Feature Fusion encoder, Tri-Cross Attention bottleneck, and Cross-Attentive Fusion decoder, augmented by a temporal consistency loss. A residual refinement stage further improves accuracy by iteratively estimating displacements across two stages. Extensive experiments on simulated data, open in vivo arm data, and private BUET breast ultrasound data demonstrate state-of-the-art performance in SNR, CNR, and lesion delineation, signaling strong potential for clinical deployment.

Abstract

Ultrasound Strain Elastography (USE) is a powerful non-invasive imaging technique for assessing tissue mechanical properties, offering crucial diagnostic value across diverse clinical applications. However, its clinical application remains limited by tissue decorrelation noise, scarcity of ground truth, and inconsistent strain estimation under different deformation conditions. To overcome these barriers, we propose MUSSE-Net, a residual-aware, multi-stage unsupervised sequential deep learning framework designed for robust and consistent strain estimation. Its backbone is our proposed USSE-Net, an end-to-end multi-stream encoder-decoder architecture that processes pre- and post-deformation RF sequences in parallel to estimate displacement fields and axial strains. The architecture combines a Context-Aware Complementary Feature Fusion (CACFF)-based encoder, a Tri-Cross Attention (TCA) bottleneck, and a Cross-Attentive Fusion (CAF)-based sequential decoder. To ensure temporal coherence and strain stability across varying deformation levels, this architecture leverages a tailored consistency loss. Finally, within the MUSSE-Net framework, a secondary residual refinement stage further enhances accuracy and suppresses noise. Extensive validation on simulation, in vivo, and private clinical datasets from the Bangladesh University of Engineering and Technology (BUET) medical center demonstrates that MUSSE-Net outperforms existing unsupervised approaches. On simulation data, MUSSE-Net achieves state-of-the-art performance with a target SNR of 24.54, background SNR of 132.76, CNR of 59.81, and elastographic SNR of 9.73. In particular, on the BUET dataset, MUSSE-Net produces strain maps with enhanced lesion-to-background contrast and significant noise suppression, yielding clinically interpretable strain patterns.
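The abstract reports strain quality through SNR and CNR but does not give the formulas. The sketch below shows one common way such quantities are computed in elastography; it is an illustration under stated assumptions, not the authors' evaluation code. The assumptions are: axial strain is the depth-wise derivative of axial displacement, SNR is the mean-to-standard-deviation ratio inside a region, and CNR is 2(mean_b - mean_t)^2 / (var_b + var_t). The function names and the toy arrays are hypothetical.

```python
import numpy as np


def axial_strain(axial_disp, dz=1.0):
    """Axial strain as the derivative of axial displacement along depth (axis 0).

    Assumption: rows index depth; dz is the sample spacing along depth.
    """
    return np.gradient(axial_disp, dz, axis=0)


def snr(region):
    """Mean-to-standard-deviation ratio of the strain values in a region."""
    return float(np.mean(region) / np.std(region))


def cnr(target, background):
    """Contrast-to-noise ratio, using a commonly cited elastography definition
    (assumed here, not taken from the paper):
    2 * (mean_b - mean_t)^2 / (var_b + var_t).
    """
    diff = np.mean(background) - np.mean(target)
    return float(2.0 * diff ** 2 / (np.var(background) + np.var(target)))


# Toy displacement that grows linearly with depth, i.e. uniform 2% strain.
depth = np.arange(5, dtype=float)[:, None] * np.ones((1, 3))
strain = axial_strain(0.02 * depth)

# Toy strain samples standing in for a lesion (target) and its surroundings.
target_region = np.array([1.0, 3.0])
background_region = np.array([4.0, 6.0])
target_snr = snr(target_region)
contrast = cnr(target_region, background_region)
```

In practice the target and background regions would be windows selected inside and outside the lesion on the estimated strain map, and a perfectly uniform region would need special handling because its standard deviation is zero.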

Paper Structure

This paper contains 22 sections, 7 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Architectural schematic of the proposed USSE-Net, which serves as the backbone of the MUSSE-Net framework.
  • Figure 2: Block diagram of the proposed multi-stage residual-aware framework, MUSSE-Net.
  • Figure 3: Qualitative comparison between the proposed framework and existing methods. From left to right, each column represents strain maps at varying strain levels.
  • Figure 4: Analysis of average SNR (target and background) and CNR metrics at different strains. The leftmost panel shows $SNR_t$ vs strain, the middle panel shows $SNR_{bg}$ vs strain, and the rightmost panel shows $CNR$ vs strain.
  • Figure 5: Qualitative strain results obtained from in vivo test data across a sequence of 5 ultrasound post-deformation RF frames. Each column shows the strain map estimated at that temporal frame by ReUSENet and by our proposed USSE-Net and MUSSE-Net.
  • ...and 4 more figures