Guidelines for Cerebrovascular Segmentation: Managing Imperfect Annotations in the context of Semi-Supervised Learning

Pierre Rougé, Pierre-Henri Conze, Nicolas Passat, Odyssée Merveille

TL;DR

This study tackles cerebrovascular segmentation under imperfect annotations and limited labeled data. It systematically evaluates five semi-supervised methods based on unsupervised regularization (Mean-Teacher, UA-MT, SASSnet, Dual-Task Consistency, MC-Net) against a supervised baseline on the Bullitt and IXI TOF-MRA datasets. Performance is quantified with the Dice similarity coefficient ($DSC$) and clDice, and the analysis covers the effects of both labeled-data quantity and annotation quality. Key findings: semi-supervised methods outperform the supervised baseline in low-label regimes and act as a regularizer, but their gains diminish as more labeled data becomes available; annotation quality and concept shift have outsized impacts, sometimes outweighing data quantity. The authors derive practical guidelines that emphasize precise annotation policies (especially for border delineation) and caution against concept shift when combining datasets, informing both data curation and method selection for robust cerebrovascular segmentation.
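As a reminder of what the $DSC$ measures, the following is a minimal sketch of the standard Dice similarity coefficient, $DSC = 2|P \cap R| / (|P| + |R|)$, for a predicted foreground $P$ and a reference foreground $R$. Representing each mask as a set of voxel coordinates, the function name `dice_score`, and the empty-mask convention are assumptions made for illustration; they are not taken from the paper's code.

```python
def dice_score(pred, ref):
    """Dice similarity coefficient between two sets of foreground voxels.

    pred, ref: iterables of hashable voxel coordinates, e.g. (z, y, x) tuples.
    """
    pred, ref = set(pred), set(ref)
    if not pred and not ref:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    overlap = len(pred & ref)
    return 2 * overlap / (len(pred) + len(ref))


# Toy example: two 3-voxel masks sharing 2 voxels.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
ref = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
print(dice_score(pred, ref))
```

Because the $DSC$ counts overlapping voxels, it is dominated by large vessels; clDice, the second metric used in the study, is a centerline-based complement that is not sketched here.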

Abstract

Segmentation in medical imaging is an essential and often preliminary task in the image processing chain, driving numerous efforts towards the design of robust segmentation algorithms. Supervised learning methods achieve excellent performance when fed with a sufficient amount of labeled data. However, such labels are typically time-consuming, error-prone and expensive to produce. Alternatively, semi-supervised learning approaches leverage both labeled and unlabeled data, and are very useful when only a small fraction of the dataset is labeled. They are particularly useful for cerebrovascular segmentation, given that labeling a single volume requires several hours for an expert. In addition to the challenge posed by insufficient annotations, there are concerns regarding annotation consistency. The task of annotating the cerebrovascular tree is inherently ambiguous. Due to the discrete nature of images, the borders and extremities of vessels are often unclear. Consequently, annotations rely heavily on the expert's subjectivity and on the underlying clinical objective. These discrepancies significantly increase the complexity of the segmentation task for the model and impair the results. It is therefore imperative to provide clinicians with precise guidelines to improve the annotation process and construct more uniform datasets. In this article, we investigate the data dependency of deep learning methods in the context of imperfect data and semi-supervised learning for cerebrovascular segmentation. Specifically, this study compares various state-of-the-art semi-supervised methods based on unsupervised regularization and evaluates their performance in scenarios that vary in data quantity and quality. Based on these experiments, we provide guidelines for the annotation and training of cerebrovascular segmentation models.
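To make "unsupervised regularization" concrete, here is a minimal sketch of the two ingredients of the generic Mean-Teacher scheme, one of the compared methods: the teacher's weights are an exponential moving average (EMA) of the student's, and a consistency loss penalizes disagreement between student and teacher predictions on unlabeled data. The function names, the dict-of-floats weight representation, and the `alpha` value are illustrative assumptions; the paper's actual architecture and hyperparameters are not given in this section.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping parameter names to floats (toy stand-in
    for network weights). alpha is an assumed smoothing factor.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions.

    Needs no ground-truth labels, so it can be computed on unlabeled voxels.
    """
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy usage: one parameter, two voxel probabilities.
teacher = ema_update({"w": 1.0}, {"w": 0.0}, alpha=0.9)
loss = consistency_loss([0.5, 1.0], [0.5, 0.0])
print(teacher, loss)
```

In a full training loop, the supervised loss on labeled volumes would be added to a weighted consistency term, and only the student would be updated by gradient descent.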

Paper Structure

This paper contains 21 sections, 17 equations, 11 figures, and 1 table.

Figures (11)

  • Figure 1: Illustration of noisy labels and the concept shift problem arising on two cerebrovascular datasets: Bullitt [aylward2002initialization] (left) and IXI [ixidataset] (right). Top row: 2D slices with labels overlaid in light red; vessels with ambiguous boundaries are indicated by green arrows, and noisy labels (such as missing vessels or vessel disconnections) are marked by red arrows. Bottom row: 3D view showing the disparities in the extent of the labels for the same global annotation policy, "labeling all cerebrovascular arteries".
  • Figure 2: Visualization, for three different patients, of the different deteriorations applied to the Bullitt dataset to investigate the impact of concept shift in cerebrovascular segmentation.
  • Figure 3: Illustration of the mean-teacher model leveraging both labeled and unlabeled data for cerebrovascular segmentation.
  • Figure 4: Visual results for the Bullitt dataset from experiment 1. Segmentation results are presented for two patients, each method, and two dataset compositions.
  • Figure 5: Visual results for the IXI dataset from experiment 1. Segmentation results are presented for two patients, each method, and two dataset compositions.
  • ...and 6 more figures