Knowledge Distillation for Semantic Segmentation: A Label Space Unification Approach
Anton Backhaus, Thorsten Luettel, Mirko Maehlisch
TL;DR
This work tackles divergent taxonomies across semantic segmentation datasets by introducing a label-space unification framework built on knowledge distillation. A teacher model trained on a source taxonomy generates ontology-constrained pseudo-labels for related datasets, and a student trained on the resulting unified dataset outperforms the teacher in urban and off-road driving tasks. The approach yields large composite datasets and shows that larger models and domain-aware priors boost performance, with robust gains on generalization benchmarks such as WildDash, though gains on individual source datasets vary. Overall, the method is a simple, architecture-agnostic way to leverage heterogeneous autonomous driving data without re-labeling or extensive model redesign, improving practical data efficiency and generalization in semantic segmentation.
Abstract
An increasing number of semantic segmentation datasets sharing similar domains have been published over the past few years. Despite the growing amount of overall data, it is still difficult to train bigger and better models because the taxonomies and/or labeling policies of different datasets are inconsistent. To this end, we propose a knowledge distillation approach that also serves as a label space unification method for semantic segmentation. In short, a teacher model is trained on a source dataset with a given taxonomy, then used to pseudo-label additional data for which ground truth labels of a related label space exist. By mapping the related taxonomies to the source taxonomy, we create constraints within which the model can predict pseudo-labels. Using the improved pseudo-labels, we train student models that consistently outperform their teachers in two challenging domains, namely urban and off-road driving. Our ground-truth-corrected pseudo-labels span 12 and 7 public datasets with 388,230 and 18,558 images for the urban and off-road domains, respectively, creating the largest compound datasets for autonomous driving to date.
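The constrained pseudo-labeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `constrained_pseudo_labels`, the boolean mapping matrix `allowed`, the `ignore_index` value, and the toy class names are all assumptions made for illustration. The idea shown is that each ground truth label from the related taxonomy restricts which source-taxonomy classes the teacher may predict at that pixel.

```python
import numpy as np

def constrained_pseudo_labels(teacher_logits, related_gt, allowed, ignore_index=255):
    """Pick the teacher's best class per pixel among the classes permitted by the
    related dataset's ground truth (hypothetical helper, for illustration only).

    teacher_logits: float array (C_src, H, W), teacher scores in the source taxonomy
    related_gt:     int array (H, W), ground truth in the related taxonomy
    allowed:        bool array (C_rel, C_src); allowed[r, s] is True if related
                    class r may correspond to source class s
    """
    # Look up the permitted source classes for every pixel: (H, W, C_src)
    mask = allowed[related_gt]
    # Move the class axis to the front to match the logits layout: (C_src, H, W)
    mask = np.moveaxis(mask, -1, 0)
    # Exclude forbidden classes from the argmax
    masked = np.where(mask, teacher_logits, -np.inf)
    labels = masked.argmax(axis=0)
    # Pixels whose related class maps to no source class are marked as ignored
    return np.where(mask.any(axis=0), labels, ignore_index)

# Toy source taxonomy: 0 = road, 1 = sidewalk, 2 = car
# Toy related taxonomy: 0 = ground (road or sidewalk), 1 = vehicle (car)
allowed = np.array([[True, True, False],
                    [False, False, True]])
# One row of two pixels; the unconstrained teacher would predict "car" for pixel 0
teacher_logits = np.array([[[1.0, 0.5]],
                           [[2.0, 0.1]],
                           [[5.0, 0.2]]])
related_gt = np.array([[0, 1]])
pseudo = constrained_pseudo_labels(teacher_logits, related_gt, allowed)
```

In this toy case the related ground truth marks the first pixel as "ground", so the teacher's preference for "car" is overruled and the best permitted class, "sidewalk", is selected instead.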
