Semi-supervised Cervical Segmentation on Ultrasound by A Dual Framework for Neural Networks

Fangyijie Wang, Kathleen M. Curran, Guénolé Silvestre

TL;DR

Scarcity of labeled ultrasound data hampers cervical segmentation accuracy. The authors propose a dual-network semi-supervised framework combining a CNN-based UNet and a Transformer-based Swin-UNet, using cross-supervised pseudo-labeling and a self-supervised contrastive loss to leverage unlabeled data. Key contributions include pixel-level cross-supervision between heterogeneous architectures and InfoNCE-based contrastive learning, validated on a public challenge dataset with competitive DSC and HD metrics and reduced inference time. The work demonstrates label-efficient segmentation in cervical ultrasound and provides openly accessible code for reproducibility.
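To make the InfoNCE-based contrastive term concrete, here is a minimal pure-Python sketch of a generic InfoNCE loss. It is an illustration under assumptions, not the authors' implementation: the function names, the use of cosine similarity, the temperature of 0.1, and the convention that `queries[i]` and `keys[i]` form the positive pair (for example, representations of the same image from the two networks) are all hypothetical choices that this summary does not specify.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss over a batch of representation pairs.

    Assumption: queries[i] and keys[i] are the positive pair;
    every other key in the batch acts as a negative for query i.
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [cosine_similarity(query, key) / temperature for key in keys]
        # Numerically stable log-sum-exp over all candidate keys.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Negative log-probability assigned to the positive key.
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy batch: correctly paired representations versus deliberately mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each query matches its own key and large when the pairs are swapped, which is the behavior a contrastive objective relies on to pull paired representations together.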

Abstract

Accurate segmentation of ultrasound (US) images of the cervical muscles is crucial for precision healthcare. The demand for automatic computer-assisted methods is high. However, the scarcity of labeled data hinders the development of these methods. Advanced semi-supervised learning approaches have shown promise in overcoming this challenge by utilizing both labeled and unlabeled data. This study introduces a novel semi-supervised learning (SSL) framework that integrates dual neural networks. The framework uses both networks to generate pseudo-labels and cross-supervise each other at the pixel level. Additionally, a self-supervised contrastive learning strategy is introduced, which employs a pair of deep representations to enhance feature learning capabilities, particularly on unlabeled data. Our framework demonstrates competitive performance in cervical segmentation tasks. Our code is publicly available at https://github.com/13204942/SSL_Cervical_Segmentation.
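The pixel-level cross-supervision described in the abstract can be sketched as follows. This is a simplified illustration under assumptions, not the paper's code: each "image" is a flat list of pixels, each pixel is a list of class logits, pseudo-labels are hard argmax labels, and the two cross-entropy terms are summed with equal weight. All function names are hypothetical, and the actual loss weighting used by the authors is not given in this section.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(logit_map):
    """Hard per-pixel pseudo-labels: the argmax class at each pixel."""
    return [max(range(len(pixel)), key=lambda c: pixel[c]) for pixel in logit_map]


def cross_entropy(logit_map, labels):
    """Mean per-pixel cross-entropy of predictions against integer labels."""
    losses = [-math.log(softmax(pixel)[label]) for pixel, label in zip(logit_map, labels)]
    return sum(losses) / len(losses)


def cross_supervision_loss(logits_a, logits_b):
    """Each network is trained on the other network's pseudo-labels.

    In a real training loop the pseudo-labels would be treated as constants
    (no gradient flows through them); that detail is implicit here because
    this sketch only evaluates the loss value.
    """
    labels_a = pseudo_labels(logits_a)
    labels_b = pseudo_labels(logits_b)
    return cross_entropy(logits_a, labels_b) + cross_entropy(logits_b, labels_a)


# Two pixels, two classes. First case: both networks agree on every pixel.
agree = cross_supervision_loss([[2.0, 0.0], [0.0, 2.0]], [[3.0, 0.0], [0.0, 3.0]])
# Second case: the networks disagree on the first pixel.
disagree = cross_supervision_loss([[2.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [0.0, 3.0]])
```

The loss is small when the two networks already agree and grows when they disagree, so minimizing it on unlabeled images pushes the two architectures toward consistent predictions.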

Paper Structure

This paper contains 13 sections, 5 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Framework for deep representations contrastive cross-supervised neural networks for semi-supervised ultrasound image segmentation.
  • Figure 2: Plots of the hyperparameter settings. From left to right: (a) Learning rate. (b) Consistency weight $\lambda$ in Equation \ref{loss}. (c) Training loss of model $f^2_{\phi}(\cdot)$.
  • Figure 3: Example ultrasound images from our validation set, ground truth (GT), and corresponding segmentation results of Swin-UNet, U-Net, Attention UNet, Efficient UNet, ResNet UNet, and MiT UNet.