Contour-Guided Query-Based Feature Fusion for Boundary-Aware and Generalizable Cardiac Ultrasound Segmentation

Zahid Ullah, Sieun Choi, Jihie Kim

Abstract

Accurate cardiac ultrasound segmentation is essential for reliable assessment of ventricular function in intelligent healthcare systems. However, echocardiographic images are challenging due to low contrast, speckle noise, irregular boundaries, and domain shifts across devices and patient populations. Existing methods, largely based on appearance-driven learning, often fail to preserve boundary precision and structural consistency under these conditions. To address these issues, we propose a Contour-Guided Query Refinement Network (CGQR-Net) for boundary-aware cardiac ultrasound segmentation. The framework integrates multi-resolution feature representations with contour-derived structural priors. An HRNet backbone preserves high-resolution spatial details while capturing multi-scale context. A coarse segmentation is first generated, from which anatomical contours are extracted and encoded into learnable query embeddings. These contour-guided queries interact with fused feature maps via cross-attention, enabling structure-aware refinement that improves boundary delineation and reduces noise artifacts. A dual-head supervision strategy jointly optimizes segmentation and boundary prediction to enforce structural consistency. The proposed method is evaluated on the CAMUS dataset and further validated on the CardiacNet dataset to assess cross-dataset generalization. Experimental results demonstrate improved segmentation accuracy, enhanced boundary precision, and robust performance across varying imaging conditions. These results highlight the effectiveness of integrating contour-level structural information with feature-level representations for reliable cardiac ultrasound segmentation.
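The core refinement step described above — contour points extracted from a coarse mask, embedded as queries, then attended against fused multi-scale features — can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation; all shapes, the 2-D linear embedding of contour coordinates, and the random inputs are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, features, d_k):
    # queries:  (N, d)  contour-derived query embeddings (structural guidance)
    # features: (HW, d) flattened fused multi-scale feature map (spatial context)
    scores = queries @ features.T / np.sqrt(d_k)  # (N, HW) similarity scores
    attn = softmax(scores, axis=-1)               # each query attends over all positions
    return attn @ features                        # (N, d) refined query representations

rng = np.random.default_rng(0)
d = 32
contour_pts = rng.standard_normal((16, 2))        # hypothetical (x, y) contour points
W_embed = rng.standard_normal((2, d)) / np.sqrt(2)  # assumed linear coordinate embedding
queries = contour_pts @ W_embed                   # contour-guided query embeddings
features = rng.standard_normal((64, d))           # e.g. an 8x8 fused feature map, flattened
refined = cross_attention(queries, features, d)
print(refined.shape)  # (16, 32): one refined embedding per contour query
```

In the full model these refined, structure-aware representations would feed the dual segmentation and boundary heads; here the sketch only shows the query-feature interaction itself.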


Paper Structure

This paper contains 42 sections, 35 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Sample echocardiographic images from the CAMUS (top row) and CardiacNet (bottom row) datasets, illustrating differences in image quality, noise levels, and anatomical variability. The CAMUS dataset contains relatively clean echocardiographic images with well-defined cardiac structures, while the CardiacNet dataset exhibits significantly higher variability, including severe speckle noise, low contrast, and irregular anatomical boundaries.
  • Figure 2: Architecture of the proposed CGQR-Net. HRNet extracts multi-resolution features from the input echocardiography image, and a coarse segmentation head provides an initial structural prediction. Contours extracted from the coarse mask are converted into query embeddings and used to refine fused multi-scale features through cross-attention. The refined representation is then passed to segmentation and boundary heads to produce the final boundary-aware multi-class segmentation.
  • Figure 3: Overall workflow of the proposed framework. CAMUS data, including two-chamber (2CH) and four-chamber (4CH) echocardiographic views, are used for training and internal validation, while CardiacNet is used for external validation. After preprocessing, images are fed into the boundary-aware segmentation model, which integrates HRNet, multi-scale feature fusion, and contour-guided query refinement. The model is evaluated using Dice similarity coefficients and qualitative analysis, producing final multi-class segmentation outputs.
  • Figure 4: Overview of the proposed contour-guided query refinement framework. A two-dimensional echocardiography image is first processed by an HRNet backbone to extract multi-resolution features, and a coarse segmentation map is then generated. Contour points extracted from the coarse prediction are converted into query embeddings. These contour-derived queries interact with fused multi-scale features through cross-attention, where the queries provide structural guidance and the fused features provide spatial context. The refined representation is finally passed to dual prediction heads to produce the segmentation output and boundary map. Here, 2D denotes two-dimensional, Conv convolution, BN batch normalization, and ReLU rectified linear unit.
  • Figure 5: Qualitative segmentation results on the CAMUS dataset. From left to right: original image, ground truth, coarse prediction, contour points, boundary map, final prediction, and overlay visualization. The proposed method produces more accurate and sharper boundaries compared to the coarse segmentation.
  • ...and 2 more figures