Auto-Annotation Quality Prediction for Semi-Supervised Learning with Ensembles

Dror Simon, Miriam Farber, Roman Goldenberg

TL;DR

The performance of a state-of-the-art model can be achieved by training it with only a fraction of the original manually labeled samples and replacing the rest with auto-annotated, quality-filtered labels.

Abstract

Auto-annotation by an ensemble of models is an efficient method of learning on unlabeled data. Wrong or inaccurate annotations generated by the ensemble may degrade the performance of the trained model. To deal with this problem, we propose filtering the auto-labeled data using a trained model that predicts the quality of an annotation from the degree of consensus between the ensemble models. Using semantic segmentation as an example, we show the advantage of the proposed auto-annotation filtering over training on data contaminated with inaccurate labels. Moreover, our experimental results show that, in the case of semantic segmentation, the performance of a state-of-the-art model can be achieved by training it with only a fraction (30%) of the original manually labeled data set and replacing the rest with auto-annotated, quality-filtered labels.
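
The filtering idea in the abstract can be illustrated with a minimal sketch. It assumes each ensemble member outputs a per-pixel class map, and it uses a fixed agreement threshold as a simple stand-in for the trained quality-prediction model, which the abstract does not specify. The function names, the threshold value, and the `IGNORE_LABEL` value are hypothetical and chosen only for illustration.

```python
import numpy as np

IGNORE_LABEL = 255  # hypothetical "masked out" value, not taken from the paper


def consensus_labels(predictions: np.ndarray, num_classes: int):
    """Majority vote over an ensemble.

    predictions: integer array of shape (n_models, H, W) holding class ids.
    Returns (labels, agreement), where agreement is the fraction of models
    that voted for the winning class at each pixel.
    """
    n_models = predictions.shape[0]
    # Count votes per class; result has shape (num_classes, H, W).
    votes = np.stack([(predictions == c).sum(axis=0) for c in range(num_classes)])
    labels = votes.argmax(axis=0)
    agreement = votes.max(axis=0) / n_models
    return labels, agreement


def filter_by_quality(labels: np.ndarray, agreement: np.ndarray, threshold: float = 0.8):
    """Mask out pixels whose consensus falls below a fixed threshold.

    Assumption: the fixed threshold replaces the learned quality predictor.
    """
    filtered = labels.copy()
    filtered[agreement < threshold] = IGNORE_LABEL
    return filtered
```

Pixels set to `IGNORE_LABEL` would then be excluded from the training loss, so only annotations on which the ensemble largely agrees contribute to learning.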

Paper Structure

This paper contains 14 sections, 2 equations, 1 figure, 5 tables.

Figures (1)

  • Figure 1: Rows, top to bottom: (1) original image, (2) unfiltered auto-annotations, (3) quality-filtered auto-annotations, (4) ground-truth annotation. Columns: three different examples. Black pixels in the fourth row represent void classes that are not counted among the 19 classes and do not contribute to the mIoU score. Black pixels in the third row represent pixels that are masked out by the quality filter.
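
The caption notes that void pixels do not contribute to the mIoU score. A minimal sketch of such a metric follows, assuming void pixels are marked with a dedicated label id. The value 255 and the function name are assumptions, and the sketch scores a single prediction map rather than accumulating counts over a data set, which may differ from the evaluation protocol used in the paper.

```python
import numpy as np

VOID_LABEL = 255  # hypothetical id for void pixels; the source does not give one


def mean_iou(pred: np.ndarray, target: np.ndarray,
             num_classes: int = 19, void_label: int = VOID_LABEL) -> float:
    """Mean intersection-over-union, skipping pixels whose ground truth is void."""
    valid = target != void_label
    pred, target = pred[valid], target[valid]
    ious = []
    for c in range(num_classes):
        intersection = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:  # classes absent from both maps are left out of the mean
            ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")
```

The same masking would apply to the pixels removed by the quality filter when filtered auto-annotations serve as training targets.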