Prior-guided Hierarchical Instance-pixel Contrastive Learning for Ultrasound Speckle Noise Suppression

Zhenyu Bu, Yuanxin Xie, Guang-Quan Zhou

TL;DR

Ultrasound speckle noise degrades image quality and hampers diagnosis. The authors develop a prior-guided hierarchical instance-pixel contrastive learning framework that combines statistics-guided pixel-level contrastive learning (SPCL), memory-augmented instance-level contrastive learning (MICL), and a hybrid Transformer-CNN denoiser to suppress noise while preserving fine anatomical detail. The method uses local statistics and a memory bank to promote noise-invariant, structure-aware representations and reports consistent improvements over state-of-the-art baselines on the BUSI and CAMUS datasets across noise levels. The approach offers a scalable, context-aware denoising strategy with potential benefits for clinical interpretation and downstream tasks.

Abstract

Ultrasound denoising is essential for mitigating speckle-induced degradations, thereby enhancing image quality and improving diagnostic reliability. Nevertheless, because speckle patterns inherently encode both texture and fine anatomical details, effectively suppressing noise while preserving structural fidelity remains a significant challenge. In this study, we propose a prior-guided hierarchical instance-pixel contrastive learning model for ultrasound denoising, designed to promote noise-invariant and structure-aware feature representations by maximizing the separability between noisy and clean samples at both pixel and instance levels. Specifically, a statistics-guided pixel-level contrastive learning strategy is introduced to enhance distributional discrepancies between noisy and clean pixels, thereby improving local structural consistency. Concurrently, a memory bank is employed to facilitate instance-level contrastive learning in the feature space, encouraging representations that more faithfully approximate the underlying data distribution. Furthermore, a hybrid Transformer-CNN architecture is adopted, coupling a Transformer-based encoder for global context modeling with a CNN-based decoder optimized for fine-grained anatomical structure restoration, thus enabling complementary exploitation of long-range dependencies and local texture details. Extensive evaluations on two publicly available ultrasound datasets demonstrate that the proposed model consistently outperforms existing methods, confirming its effectiveness and superiority.
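The memory-bank, instance-level contrastive idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' loss: it is a generic InfoNCE-style objective, and it assumes that features of the denoised output act as anchors, features of the clean reference act as positives, and a memory bank of noisy-sample features supplies negatives. The function names, the temperature value, and the first-in-first-out bank update are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce_with_memory(anchor, positive, memory_bank, temperature=0.1):
    """Generic InfoNCE loss with negatives drawn from a memory bank.

    anchor:      (B, D) features of denoised outputs (assumed role)
    positive:    (B, D) features of clean references (assumed role)
    memory_bank: (K, D) stored features of noisy samples (assumed role)
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    n = l2_normalize(memory_bank)

    pos_logit = np.sum(a * p, axis=1, keepdims=True) / temperature  # (B, 1)
    neg_logits = (a @ n.T) / temperature                            # (B, K)
    logits = np.concatenate([pos_logit, neg_logits], axis=1)        # (B, 1 + K)

    # Subtract the row maximum for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())


def update_memory(memory_bank, new_features, capacity):
    """Hypothetical FIFO update: append normalized features, keep the newest rows."""
    bank = np.concatenate([memory_bank, l2_normalize(new_features)], axis=0)
    return bank[-capacity:]


rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 16))
bank = l2_normalize(rng.normal(size=(32, 16)))

# An anchor near the clean features should score a lower loss than one near the bank.
near_clean = clean + 0.01 * rng.normal(size=clean.shape)
near_noisy = bank[:4] + 0.01 * rng.normal(size=(4, 16))
loss_near_clean = info_nce_with_memory(near_clean, clean, bank)
loss_near_noisy = info_nce_with_memory(near_noisy, clean, bank)
```

Under these assumptions, minimizing the loss pulls denoised features toward clean ones and pushes them away from the stored noisy features, which is one way to read the abstract's goal of separating noisy and clean samples at the instance level.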

Paper Structure

This paper contains 21 sections, 20 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: Overall pipeline of the proposed prior-guided hierarchical instance–pixel contrastive learning framework for ultrasound speckle noise suppression. The noisy ultrasound image is processed by a hybrid Transformer–CNN pipeline, in which the Transformer-based encoder captures global contextual features and the CNN-based decoder focuses on restoring fine-grained anatomical structure. The prior-guided hierarchical instance–pixel contrastive learning module, shown in the middle, jointly performs pixel-level and instance-level contrastive learning to encourage noise-invariant, structure-aware representations.
  • Figure 2: Qualitative comparison of denoising results on representative ultrasound images from CAMUS and BUSI. The top row of each subplot shows the clean reference, the noisy input, and the outputs of classical and deep learning-based baselines (BM3D, UNet, and RED-CNN). The bottom row shows the results of DnCNN, SwinIR, Uformer, Restormer, and the proposed method. Red boxes mark regions of interest for detailed visual inspection.