Label Filling via Mixed Supervision for Medical Image Segmentation from Noisy Annotations

Ming Li, Wei Shen, Qingli Li, Yan Wang

TL;DR

Results on five datasets of diverse imaging modalities show that the LF-Net boosts segmentation accuracy in all datasets compared with state-of-the-art methods, with even a 7% improvement in DSC for MS lesion segmentation.

Abstract

Successful medical image segmentation usually requires a large number of high-quality labels. Because the labeling process is affected by raters' varying skill levels and characteristics, the masks provided by different raters often suffer from high inter-rater variability. In this paper, we propose a simple yet effective Label Filling framework, termed LF-Net, which predicts the groundtruth segmentation label given only noisy annotations during training. The fundamental idea of label filling is to supervise the segmentation model with a subset of pixels that have trustworthy labels, while filling in the labels of the other pixels by mixed supervision. More concretely, we propose a qualified majority voting strategy: a threshold voting scheme models agreement among raters, and the majority-voted labels of the selected subset of pixels are used as supervision. To fill in the labels of the other pixels, two types of mixed auxiliary supervision are proposed: a soft label learned from the intrinsic structures of the noisy annotations, and rater characteristics labels, which propagate information about each individual rater's characteristics. LF-Net has two main advantages. 1) Training on trustworthy pixels provides confident supervision, guiding the direction of groundtruth label learning. 2) The two types of mixed supervision prevent over-fitting when the network is supervised by only a subset of pixels, and guarantee high fidelity with the true label. Results on five datasets of diverse imaging modalities show that our LF-Net boosts segmentation accuracy in all datasets compared with state-of-the-art methods, with even a 7% improvement in DSC for MS lesion segmentation.
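To make the qualified majority voting idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes binary masks from R raters stacked into one array; the function name `qualified_majority_vote`, the `threshold` parameter, and its default value are hypothetical choices for illustration, since the abstract does not state the threshold or how ties are handled.

```python
import numpy as np

def qualified_majority_vote(annotations, threshold=0.75):
    """Illustrative threshold voting over binary rater masks.

    annotations: array of shape (R, H, W) with values in {0, 1}.
    threshold:   assumed minimum fraction of raters that must agree
                 for a pixel's majority label to count as trustworthy.
    Returns (labels, trusted), both of shape (H, W).
    """
    annotations = np.asarray(annotations).astype(np.int64)
    num_raters = annotations.shape[0]

    # Number of raters marking each pixel as foreground.
    foreground_votes = annotations.sum(axis=0)

    # Plain majority label; ties (possible for even R) fall to background here.
    labels = (2 * foreground_votes > num_raters).astype(np.uint8)

    # Size of the largest agreeing group at each pixel.
    agreeing = np.maximum(foreground_votes, num_raters - foreground_votes)

    # A pixel is trusted only when enough raters agree on its label.
    trusted = agreeing >= threshold * num_raters
    return labels, trusted

# Five raters annotating a 1x3 image.
masks = np.array([
    [[1, 1, 0]],
    [[1, 1, 0]],
    [[1, 1, 0]],
    [[1, 0, 0]],
    [[1, 0, 1]],
])
labels, trusted = qualified_majority_vote(masks)
```

In this toy input the first pixel has unanimous agreement and the third has four of five raters agreeing, so both are selected, while the second pixel (a 3-to-2 split) is left out. In the framework described above, such left-out pixels are the ones whose labels would be filled through the mixed auxiliary supervision rather than through the voted label.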

Paper Structure

This paper contains 23 sections, 10 equations, 10 figures, 4 tables, 1 algorithm.

Figures (10)

  • Figure 1: A typical label acquisition process. Images and annotation maps are from LIDC-IDRI dataset SG2011the.
  • Figure 2: A glimpse of designed architectures. (a) A segmentation network with the proposed QMV strategy. (b) A segmentation network with mixed supervision. QMV means qualified majority voting.
  • Figure 3: The training stage of LF-Net. To guide the training process in line with the groundtruth, we propose a qualified majority voting strategy (a threshold voting scheme selects pixels whose majority-voted labels are trustworthy). First, a soft label learning network is trained. Then, to assist the learning of the segmentation model and fill in correct labels for the other pixels, two types of auxiliary supervision are designed: (1) the soft label obtained from the soft label learning network, which carries information from the intrinsic structures of the noisy annotations, and (2) rater characteristics labels (i.e., the noisy annotations), which let the rater characteristics net module disentangle human errors from the filled label and thereby help the label filling during back-propagation.
  • Figure 4: Visualizations of the groundtruth (GT) provided by senior doctors and of noisy segmentation labels. Examples from the LIDC-IDRI and RIGA datasets are shown, respectively.
  • Figure 5: Visualizations of five noisy segmentation labels. The original image and groundtruth are also shown on the left side of the dotted line for reference.
  • ...and 5 more figures