Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation
Fei Wu, Pablo Marquez-Neila, Hedyeh Rafi-Tarii, Raphael Sznitman
TL;DR
This work addresses two problems in multi-class semantic segmentation: the high cost of annotation and the neglect of boundary pixels during sample selection. It introduces OREAL, a patch-based active learning method that combines maximum aggregation of pixel uncertainties with a novel one-vs-rest entropy score for implicit class balancing. OREAL operates on superpixel patches with dominant labeling to minimize annotation effort, and its class-balancing-aware annotation strategy is designed to keep tail classes well represented. Across four diverse datasets and multiple architectures, OREAL achieves competitive and often superior AuALC and mIoU, with notable gains from the max aggregation strategy, which emphasizes boundary regions. The results indicate that prioritizing context around objects and class-aware uncertainty can substantially reduce labeling cost while improving segmentation performance in real-world settings.
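To make "superpixel patches with dominant labeling" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that dominant labeling means each superpixel receives a single class label, simulated here as the most frequent ground-truth class among its pixels. The function name `dominant_labels` and the toy arrays are hypothetical.

```python
import numpy as np

def dominant_labels(label_map, superpixels, num_classes):
    """Return, for each superpixel id, the most frequent class among its pixels.

    label_map:   (H, W) integer class labels (simulated oracle annotation).
    superpixels: (H, W) integer superpixel ids in [0, num_superpixels).
    """
    num_sp = int(superpixels.max()) + 1
    # Joint histogram: one row per superpixel, one column per class.
    joint = np.bincount(
        superpixels.ravel() * num_classes + label_map.ravel(),
        minlength=num_sp * num_classes,
    ).reshape(num_sp, num_classes)
    return joint.argmax(axis=1)

# Toy 2x4 image split into two superpixels (left half, right half).
label_map = np.array([[0, 0, 1, 2],
                      [0, 1, 2, 2]])
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1]])

per_superpixel = dominant_labels(label_map, superpixels, num_classes=3)
# Broadcast the single label of each superpixel back to its pixels.
annotation = per_superpixel[superpixels]
```

Under this assumption the annotator supplies one label per patch rather than one per pixel, which is where the saving in annotation effort comes from; pixels that disagree with the dominant class are labeled incorrectly, which is the cost of the shortcut.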
Abstract
Multi-class semantic segmentation remains a cornerstone challenge in computer vision, yet creating datasets for it is excessively demanding in time and effort, especially in specialized domains. Active Learning (AL) mitigates this challenge by strategically selecting data points for annotation. However, existing patch-based AL methods often overlook the critical information carried by boundary pixels, which is essential for accurate segmentation. We present OREAL, a novel patch-based AL method designed for multi-class semantic segmentation. OREAL enhances boundary detection by employing maximum aggregation of pixel-wise uncertainty scores. Additionally, we introduce one-vs-rest entropy, a novel uncertainty score function that computes class-wise uncertainties while achieving implicit class balancing during dataset creation. Comprehensive experiments across diverse datasets and model architectures validate our approach.
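The two ingredients named above can be illustrated with a small sketch. The exact definition of one-vs-rest entropy is not given in this section, so the code assumes one plausible reading: for each class c, the binary entropy of "class c versus everything else", computed from the softmax probability p_c. The function names, the toy probabilities, and the patch layout are all hypothetical; the sketch only contrasts maximum with mean aggregation of pixel scores within a patch.

```python
import numpy as np

def one_vs_rest_entropy(probs, eps=1e-12):
    """Class-wise binary entropy (assumed form, not taken from the paper).

    probs: (C, H, W) softmax probabilities.
    Returns (C, H, W): -p*log(p) - (1-p)*log(1-p) for each class.
    """
    p = np.clip(probs, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def patch_scores(pixel_scores, patch_ids, num_patches, reduce="max"):
    """Aggregate an (H, W) score map into one score per patch."""
    flat_scores = pixel_scores.ravel()
    flat_ids = patch_ids.ravel()
    if reduce == "max":
        out = np.full(num_patches, -np.inf)
        np.maximum.at(out, flat_ids, flat_scores)  # per-patch running maximum
        return out
    sums = np.bincount(flat_ids, weights=flat_scores, minlength=num_patches)
    counts = np.bincount(flat_ids, minlength=num_patches)
    return sums / np.maximum(counts, 1)

# Toy 2x4 image, 3 classes. Patch 0 (left) is confident except for one
# boundary-like pixel; patch 1 (right) is mildly uncertain everywhere.
confident = [0.98, 0.01, 0.01]
boundary = [0.5, 0.5, 0.0]
diffuse = [0.8, 0.1, 0.1]
pixels = np.array([[confident, boundary, diffuse, diffuse],
                   [confident, confident, diffuse, diffuse]])  # (H, W, C)
probs = pixels.transpose(2, 0, 1)                              # (C, H, W)
patch_ids = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1]])

ovr = one_vs_rest_entropy(probs)
max_scores = patch_scores(ovr[0], patch_ids, 2, reduce="max")
mean_scores = patch_scores(ovr[0], patch_ids, 2, reduce="mean")
```

In this toy case, maximum aggregation ranks the left patch highest because a single highly uncertain pixel dominates its score, whereas mean aggregation dilutes that pixel among its confident neighbors and prefers the right patch. This is consistent with the claim that maximum aggregation emphasizes boundary regions, though how the class-wise scores are combined into a final selection is not specified here.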
