Semi-Supervised Segmentation via Embedding Matching

Weiyi Xie, Nathalie Willems, Nikolas Lessmann, Tom Gibbons, Daniele De Massari

TL;DR

This work tackles the high labeling cost of 3D medical image segmentation with a semi-supervised framework that combines uncertainty-driven pseudo-labeling with embedding-based label propagation in a teacher–student setting. The method uses Monte Carlo Dropout to identify reliable teacher predictions and propagates labels to uncertain voxels via voxel-wise embedding matching, augmented by an entropy-minimization regularizer to sharpen class separation. On hip-bone CT segmentation, the approach delivers state-of-the-art performance with only a small amount of labeled data (e.g., $50$ patches from $4$ CT scans), achieving $\mathrm{HD95}=3.30$ mm and $\mathrm{IoU}=0.929$. The results show robust pseudo-label coverage and improved segmentation under limited supervision, and they indicate where more labeled data are needed, namely to capture anatomical and artifact-related variations.

Abstract

Deep convolutional neural networks are widely used in medical image segmentation but require many labeled images for training. Annotating three-dimensional medical images is time-consuming and costly. To overcome this limitation, we propose a novel semi-supervised segmentation method that trains on mostly unlabeled images together with a small set of labeled images. Our approach assesses prediction uncertainty to identify reliable teacher-model predictions on unlabeled voxels. These voxels serve as pseudo-labels for training the student model. For voxels where the teacher model produces unreliable predictions, pseudo-labels are assigned by voxel-wise embedding correspondence, using reference voxels from labeled images. We applied this method to automate hip bone segmentation in CT images, achieving notable results with just 4 CT scans. The proposed approach yielded a 95th-percentile Hausdorff distance (HD95) of 3.30 and an IoU of 0.929, surpassing the best results of existing methods (HD95 of 4.07 and IoU of 0.927).
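
The two-stage pseudo-labeling described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pseudo_label`, the use of predictive entropy as the uncertainty measure, the `max_entropy` threshold, and cosine similarity with a single nearest neighbor are all hypothetical choices made for the example. It assumes the teacher's softmax outputs from several stochastic (dropout-enabled) forward passes and the per-voxel embeddings have already been computed and flattened to one row per voxel.

```python
import numpy as np

def pseudo_label(prob_samples, embeddings, ref_embeddings, ref_labels,
                 max_entropy=0.3):
    """Assign pseudo-labels to unlabeled voxels (illustrative sketch).

    prob_samples:   (T, N, C) teacher softmax outputs from T dropout passes
    embeddings:     (N, D) teacher embeddings of the unlabeled voxels
    ref_embeddings: (M, D) embeddings of reference voxels from labeled images
    ref_labels:     (M,)   integer class labels of the reference voxels
    max_entropy:    hypothetical threshold separating reliable from unreliable
    """
    # Average the stochastic passes and measure predictive entropy per voxel.
    mean_prob = prob_samples.mean(axis=0)                          # (N, C)
    entropy = -(mean_prob * np.log(mean_prob + 1e-8)).sum(axis=1)  # (N,)

    # Reliable voxels keep the teacher's own prediction.
    labels = mean_prob.argmax(axis=1)
    uncertain = entropy > max_entropy

    # Unreliable voxels take the label of the most similar reference voxel
    # (cosine similarity is an assumption made for this sketch).
    if uncertain.any():
        q = embeddings[uncertain]
        q = q / np.linalg.norm(q, axis=1, keepdims=True)
        r = ref_embeddings / np.linalg.norm(ref_embeddings, axis=1, keepdims=True)
        nearest = (q @ r.T).argmax(axis=1)
        labels[uncertain] = ref_labels[nearest]

    return labels, uncertain


# Toy example: voxel 0 is predicted consistently, voxel 1 flips between passes.
prob_samples = np.array([
    [[0.99, 0.01], [0.9, 0.1]],
    [[0.99, 0.01], [0.1, 0.9]],
])
embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
ref_embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
ref_labels = np.array([0, 1])

labels, uncertain = pseudo_label(prob_samples, embeddings,
                                 ref_embeddings, ref_labels)
```

In the toy example, the second voxel's averaged prediction is an even split between the two classes, so it is flagged as uncertain and its label comes from the reference voxel with the closest embedding rather than from the teacher's prediction.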

Paper Structure (15 sections, 6 equations, 2 figures, 4 tables)


Figures (2)

  • Figure 1: Semi-supervised Semantic Segmentation Framework for training with unlabeled images. The teacher model produces pseudo labels for training the student model via uncertainty analysis and nearest-neighbor matching.
  • Figure 2: Qualitative results of the proposed method trained with 50, 150, 300, 1000, and 2500 labeled patches ($3^{rd}$ to $7^{th}$ columns), compared with the ground truth ($1^{st}$ column) and the fully-supervised baseline ($2^{nd}$ column). Each 3D patch is shown by its central axial slice. As more labeled examples are added, the proposed method converges to the manual labels, although consistent errors remain in segmenting metal artifacts ($2^{nd}$ row) and in identifying local anatomical structures that are ambiguous in the manual segmentation ($3^{rd}$ row).