Semi-Supervised Active Learning for Semantic Segmentation in Unknown Environments Using Informative Path Planning

Julius Rückin, Federico Magistri, Cyrill Stachniss, Marija Popović

TL;DR

This work tackles semantic segmentation for robots operating in unknown environments by marrying sparse human labeling with self-generated pseudo labels through an adaptive, map-guided data-collection strategy. It introduces a probabilistic, multi-layer semantic environment map and an uncertainty-driven frontier planner to collect informative data under budget constraints, followed by semi-supervised retraining that fuses sparse human labels with uncertainty-aware pseudo labels. Key contributions include (i) probabilistic semantic environment mapping, (ii) an adaptive frontier-based planning objective, (iii) a semi-supervised training regime with region-impurity-guided sparse labeling, and (iv) uncertainty-aware pseudo-label generation that outperforms self-supervised baselines while greatly reducing labeling effort. Experimental results on the ISPRS Potsdam dataset show substantial gains over non-targeted labeling and self-supervised methods, with semi-supervised performance approaching fully supervised baselines using only a fraction of human labels, highlighting practical impact for autonomous robotic perception in new environments.

Abstract

Semantic segmentation enables robots to perceive and reason about their environments beyond geometry. Most such systems build on deep learning. As autonomous robots are commonly deployed in initially unknown environments, pre-training on static datasets cannot always capture the variety of domains, which limits the robot's perception performance during missions. Recently, self-supervised and fully supervised active learning methods have emerged to improve a robot's vision. These approaches either rely on large in-domain pre-training datasets or require substantial human labelling effort. We propose a planning method for semi-supervised active learning of semantic segmentation that substantially reduces human labelling requirements compared to fully supervised approaches. We leverage an adaptive map-based planner that is guided towards frontiers of unexplored space with high model uncertainty and collects training data for human labelling. A key aspect of our approach is to combine the sparse high-quality human labels with pseudo labels automatically extracted from highly certain environment map areas. Experimental results show that our method reaches segmentation performance close to fully supervised approaches with drastically reduced human labelling effort while outperforming self-supervised approaches.

Paper Structure

This paper contains 13 sections, 4 equations, 9 figures, and 1 table.

Figures (9)

  • Figure 1: Our semi-supervised active learning approach in an unknown environment (top). We infer semantic segmentation (top-centre) and model uncertainty (top-right) and fuse both in environment maps. The robot re-plans its path (orange, top-left) to collect diverse uncertain images. After each mission, we select sparse sets of pixels for human and self-supervised labelling (bottom). Self-supervised labels are rendered from low-uncertainty semantic map regions. Human labels are queried for regions of cluttered model predictions.
  • Figure 2: During a mission, a semantic segmentation network predicts pixel-wise semantics and model uncertainties from an RGB-D image. Both are fused into an uncertainty-aware semantic environment map (see the paper's mapping section). Our planner guides the collection of training data for network re-training based on the robot state and map belief (see the paper's planning section). After a mission, the collected data is labelled using two sources of labels: (i) a human annotator labels a sparse set of informative pixels, and (ii) we automatically render pseudo labels from the semantic map in an uncertainty-aware fashion.
  • Figure 3: Comparison of label selection methods with $\alpha = 1000$ human-labelled pixels per image using our frontier planner on ISPRS Potsdam. Frontier (yellow) and coverage (orange) planners use densely labelled images indicating performance upper bounds. Results are averaged over three runs. Shaded regions indicate one standard deviation. Our proposed method (dark blue) outperforms state-of-the-art pixel selection methods.
  • Figure 4: Qualitative results of our human label pixel selection method on ISPRS Potsdam. Columns from left to right: RGB input, ground truth, prediction, pixels selected for re-training, model uncertainty. Selected pixels are expanded to their one-pixel neighbourhood for visualisation. Our method selects pixels in areas of cluttered predictions, often corresponding to misclassified regions.
  • Figure 5: Comparison of our human label selection method (solid lines) to random label selection (dashed transparent lines) over varying labelling budgets $\alpha \in [100, 10000]\,\text{px}$ using our frontier planner on ISPRS Potsdam. Results are averaged over three runs. Shaded regions indicate one standard deviation. The performance gain of our method drastically increases for lower labelling budgets.
  • ...and 4 more figures
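
The captions above describe two label sources: human labels queried where predictions are cluttered, and pseudo labels taken from low-uncertainty map regions. The following is a minimal sketch of how such a split could look, not the authors' implementation. It assumes that region impurity is measured as the entropy of predicted class labels in a small window around each pixel, and that pseudo labels are accepted wherever an uncertainty value falls below a fixed threshold. The function names, the window `radius`, and the `threshold` parameter are hypothetical and chosen only for illustration; `alpha` mirrors the per-image pixel budget mentioned in the captions.

```python
import numpy as np


def region_impurity(pred, num_classes, radius=1):
    """Entropy of predicted class labels in a (2r+1)x(2r+1) window per pixel.

    Assumption: this window-entropy definition is one common reading of
    "region impurity"; the paper's exact formulation may differ.
    """
    h, w = pred.shape
    k = 2 * radius + 1
    padded = np.pad(pred, radius, mode="edge")
    impurity = np.zeros((h, w), dtype=np.float64)
    for c in range(num_classes):
        mask = (padded == c).astype(np.float64)
        # Fraction of window pixels that are predicted as class c.
        frac = np.zeros((h, w), dtype=np.float64)
        for dy in range(k):
            for dx in range(k):
                frac += mask[dy:dy + h, dx:dx + w]
        frac /= k * k
        nonzero = frac > 0
        impurity[nonzero] -= frac[nonzero] * np.log(frac[nonzero])
    return impurity


def select_human_pixels(pred, num_classes, alpha, radius=1):
    """Return (row, col) indices of the alpha pixels with highest impurity."""
    scores = region_impurity(pred, num_classes, radius)
    order = np.argsort(scores, axis=None, kind="stable")[::-1][:alpha]
    return np.stack(np.unravel_index(order, scores.shape), axis=1)


def pseudo_label_mask(uncertainty, threshold):
    """Boolean mask of pixels certain enough to receive a pseudo label.

    Assumption: a hard threshold stands in for the uncertainty-aware
    rendering described in the captions.
    """
    return uncertainty < threshold
```

For a toy prediction that is class 0 on the left half and class 1 on the right half, `select_human_pixels` picks pixels along the class boundary, which matches the intuition that cluttered or mixed predictions are the most informative to label.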