Open-World Semantic Segmentation Including Class Similarity
Matteo Sodano, Federico Magistri, Lucas Nunes, Jens Behley, Cyrill Stachniss
TL;DR
This work tackles open-world semantic segmentation by handling anomaly segmentation and novel class discovery jointly in a single, lightweight encoder–decoder network. The architecture has two decoders: a semantic decoder that learns a feature descriptor per class and pulls known-class features toward fixed prototypes, and a contrastive decoder that uses the objectosphere and contrastive losses to separate unknown regions and enable pixel-level anomaly scoring. A post-processing phase stores activation vectors, updates class prototypes, and clusters unknown pixels into new classes, while a Gaussian similarity model measures how each unknown relates to the known categories. Experiments on SegmentMeIfYouCan and BDDAnomaly show state-of-the-art anomaly segmentation performance, strong open-world segmentation capability, and credible class-similarity predictions; ablations confirm the value of the feature-space losses and the Gaussian post-processing. Overall, the approach advances practical open-world scene understanding with robust detection of novel objects and similarity signals that downstream planning and mapping can use.
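To make the role of the objectosphere loss more concrete, here is a minimal sketch in plain Python. It follows the commonly cited formulation of that loss (known samples are penalized when their feature norm falls below a margin, unknown samples are penalized for any nonzero norm) and is not taken from this paper's implementation. The margin `xi`, the function names, and the norm-based `anomaly_score` are assumptions introduced for illustration only; the paper's actual scoring rule may differ.

```python
import math


def feature_norm(vec):
    """Euclidean norm of one per-pixel feature vector."""
    return math.sqrt(sum(v * v for v in vec))


def objectosphere_loss(features, is_known, xi=1.0):
    """Mean objectosphere penalty over a batch of per-pixel features.

    features: list of feature vectors (lists of floats)
    is_known: list of bools, True where the pixel has a known class
    xi: assumed minimum feature norm required for known pixels
    """
    total = 0.0
    for vec, known in zip(features, is_known):
        norm = feature_norm(vec)
        if known:
            # Known pixels are penalized only if their norm is below xi.
            total += max(xi - norm, 0.0) ** 2
        else:
            # Unknown pixels are pushed toward the origin.
            total += norm ** 2
    return total / len(features)


def anomaly_score(vec, xi=1.0):
    """Illustrative score in [0, 1]: 1 at the origin, 0 once norm >= xi."""
    return max(1.0 - feature_norm(vec) / xi, 0.0)


if __name__ == "__main__":
    feats = [[3.0, 4.0], [0.0, 0.5], [0.6, 0.8]]
    known = [True, True, False]
    print(objectosphere_loss(feats, known))
    print(anomaly_score([0.0, 0.0]), anomaly_score([3.0, 4.0]))
```

Under this kind of loss, features of unknown regions end up with small norms while known-class features keep large ones, which is what makes a per-pixel, norm-based anomaly score plausible.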
Abstract
Interpreting camera data is key for autonomously acting systems, such as autonomous vehicles. Vision systems that operate in real-world environments must understand their surroundings and be able to deal with novel situations. This paper tackles open-world semantic segmentation, i.e., the variant of semantic segmentation in which images contain objects that were not seen during training. We propose a novel approach that performs accurate closed-world semantic segmentation and, at the same time, can identify new categories without requiring any additional training data. For every newly discovered class in an image, our approach additionally provides a measure of its similarity to a known category, which can be useful information in downstream tasks such as planning or mapping. Through extensive experiments, we show that our model achieves state-of-the-art results on classes known from training data as well as for anomaly segmentation, and that it can distinguish between different unknown classes.
