SS-ADA: A Semi-Supervised Active Domain Adaptation Framework for Semantic Segmentation

Weihao Yan, Yeqiang Qian, Yueyuan Li, Tao Li, Chunxiang Wang, Ming Yang

TL;DR

SS-ADA introduces a semi-supervised active domain adaptation framework for semantic segmentation in driving scenes, uniting a semi-supervised learning module, an image-based active learning component, and an IoU-based class weighting scheme. By querying informative images at targeted epochs and weighting classes by IoU rather than frequency, the method achieves supervised-level accuracy with only 25% of target labels across synthetic-to-real and real-to-real domain shifts, including challenging fisheye camera scenarios. The approach preserves contextual information through image-level acquisitions, demonstrates strong gains over existing ADA methods, and shows robust performance across datasets with varying weather and sensor conditions. This work reduces annotation costs while maintaining high segmentation quality, enabling more practical deployment of vision systems in new driving environments.

Abstract

Semantic segmentation plays an important role in intelligent vehicles, providing pixel-level semantic information about the environment. However, labeling is expensive and time-consuming when a semantic segmentation model is applied to new driving scenarios. To reduce the costs, semi-supervised semantic segmentation methods have been proposed to leverage large quantities of unlabeled images. Despite this, their performance still falls short of the accuracy required for practical applications, which is typically achieved by supervised learning. A significant shortcoming is that they typically select unlabeled images for annotation randomly, neglecting the assessment of sample value for model training. In this paper, we propose a novel semi-supervised active domain adaptation (SS-ADA) framework for semantic segmentation that employs an image-level acquisition strategy. SS-ADA integrates active learning into semi-supervised semantic segmentation to achieve the accuracy of supervised learning with a limited amount of labeled data from the target domain. Additionally, we design an IoU-based class weighting strategy to alleviate the class imbalance problem using annotations from active learning. We conducted extensive experiments on synthetic-to-real and real-to-real domain adaptation settings. The results demonstrate the effectiveness of our method. SS-ADA can achieve or even surpass the accuracy of its supervised learning counterpart with only 25% of the target labeled data when using a real-time segmentation model. The code for SS-ADA is available at https://github.com/ywher/SS-ADA.
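
To make the idea of image-level acquisition concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state how sample value is assessed, so mean per-pixel predictive entropy is used here purely as an assumed stand-in score; the helper names `mean_entropy` and `select_images` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_entropy(prob_map: Sequence[Sequence[float]]) -> float:
    """Average per-pixel entropy of a prediction.

    prob_map holds one class-probability distribution per pixel.
    ASSUMPTION: entropy is only an illustrative value score; the
    excerpt does not specify the criterion SS-ADA actually uses.
    """
    total = 0.0
    for pixel_probs in prob_map:
        total -= sum(p * math.log(p) for p in pixel_probs if p > 0.0)
    return total / len(prob_map)


def select_images(scores: Dict[str, float], budget: int) -> List[str]:
    """Pick whole images (not pixels or regions) with the highest scores.

    Ties are broken by image name so the selection is deterministic.
    """
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:budget]


# Toy predictions: two classes, four pixels per image (hypothetical data).
predictions = {
    "confident.png": [[1.0, 0.0]] * 4,
    "uncertain.png": [[0.5, 0.5]] * 4,
    "mixed.png": [[0.9, 0.1]] * 4,
}
scores = {name: mean_entropy(probs) for name, probs in predictions.items()}
queried = select_images(scores, budget=1)
```

Because whole images are sent for annotation, each queried sample keeps its full scene context, which is the property the abstract attributes to image-level acquisition.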

Paper Structure

This paper contains 34 sections, 12 equations, 13 figures, 8 tables, 1 algorithm.

Figures (13)

  • Figure 1: The performance of joint training, supervised and semi-supervised learning, active domain adaptation and SS-ADA on synthetic-to-real and real-to-real domain adaptation settings.
  • Figure 2: The training framework of SS-ADA for semantic segmentation. The semi-supervised learning module leverages source labeled data, target labeled data, and target unlabeled data for training. The active learning module is triggered at specified training epochs $T_{ac}$ to evaluate the value of the unlabeled data, select a portion of them for manual annotation, and update the target domain data accordingly. The IoU-based class weighting strategy is applied after the active learning module to calculate the class weights. Training then returns to the semi-supervised learning module.
  • Figure 3: Illustration of annotation forms for ADA. From top to bottom: original images, pixel/region-level (annotating 25% of the pixels in each image), and image-level (annotating 25% of the images). Colored regions indicate the areas to be annotated, while white regions represent the unlabeled areas.
  • Figure 4: The class imbalance problem in GTA5-to-Cityscapes.
  • Figure 5: Some examples of the FishEyeCampus dataset.
  • ...and 8 more figures
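
The control flow described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the weighting formula (weights proportional to 1 - IoU, normalized to mean 1) is an assumed form, since the exact formula is not given in this excerpt, and the function names `iou_class_weights` and `training_schedule` are hypothetical.

```python
from typing import List, Set


def iou_class_weights(ious: List[float], eps: float = 1e-6) -> List[float]:
    """Give poorly segmented classes larger loss weights.

    ASSUMPTION: weight is proportional to (1 - IoU) and rescaled so the
    mean weight is 1. The source only says weights are IoU-based rather
    than frequency-based.
    """
    raw = [1.0 - iou + eps for iou in ious]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


def training_schedule(num_epochs: int, t_ac: Set[int]) -> List[str]:
    """List the steps taken per epoch, following the Figure 2 description.

    t_ac is the set of epochs at which active learning is triggered.
    Running the query after that epoch's training is an assumption.
    """
    steps = []
    for epoch in range(num_epochs):
        steps.append(f"epoch {epoch}: semi-supervised training")
        if epoch in t_ac:
            steps.append(f"epoch {epoch}: active learning query")
            steps.append(f"epoch {epoch}: update IoU-based class weights")
    return steps


# Hypothetical per-class IoU values measured on newly annotated images.
weights = iou_class_weights([0.9, 0.5, 0.1])
schedule = training_schedule(num_epochs=4, t_ac={1})
```

In this sketch the class with the lowest IoU receives the largest weight, so the reweighting targets classes the model currently segments poorly instead of classes that are merely rare.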