AttEntropy: On the Generalization Ability of Supervised Semantic Segmentation Transformers to New Objects in New Domains
Krzysztof Lis, Matthias Rottmann, Annika Mütze, Sina Honari, Pascal Fua, Mathieu Salzmann
TL;DR
The paper tackles the problem of segmenting unseen objects in new domains using vision transformers trained for semantic segmentation. It introduces AttEntropy, a method that converts intermediate spatial attention maps into entropy heatmaps via the Shannon entropy $E^l(Z^{l-1})_j = - \sum_{j'} \bar{A}^{l}(Z^{l-1})_{j,j'} \log \bar{A}^{l}(Z^{l-1})_{j,j'}$, enabling segmentation of never-seen-before categories. The authors validate AttEntropy across multiple backbones and datasets, including Cityscapes-based models and broader domains such as Lunar, Maritime, and Aircraft, and demonstrate robustness under no, partial, and complete domain shift. The results show that a training-free entropy-based cue can approach, and in some cases compete with, training-based obstacle detection methods while incurring negligible overhead, and that automatic layer selection further improves performance. The work suggests a practical pathway for open-world segmentation and pre-segmentation in robotics and autonomous driving, allowing new object categories to be handled without additional training.
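To make the entropy formula above concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes that $\bar{A}^{l}$ is the attention matrix of layer $l$ averaged over heads, with each row a probability distribution over tokens $j'$. The function names `attention_entropy` and `entropy_heatmap`, the `eps` stabilizer, and the `grid_hw` argument are hypothetical and introduced only for illustration.

```python
import numpy as np


def attention_entropy(attn, eps=1e-12):
    """Per-token Shannon entropy of a layer's attention.

    attn: array of shape (heads, N, N); attn[h, j, j'] is the weight
          that query token j assigns to token j' in head h.
    Returns an array of shape (N,), one entropy value per query token j.
    """
    # Assumption: A-bar is the mean over heads.
    a_bar = attn.mean(axis=0)
    # Renormalize rows so each one sums to 1 despite numerical drift.
    a_bar = a_bar / a_bar.sum(axis=-1, keepdims=True)
    # E_j = -sum_{j'} A_{j,j'} log A_{j,j'}; eps guards against log(0).
    return -(a_bar * np.log(a_bar + eps)).sum(axis=-1)


def entropy_heatmap(attn, grid_hw):
    """Reshape per-token entropies onto the (height, width) token grid."""
    h, w = grid_hw
    return attention_entropy(attn).reshape(h, w)
```

In this sketch, a token whose attention is spread uniformly over all $N$ tokens receives the maximum entropy $\log N$, while a token that attends to a single location receives an entropy close to zero. How the resulting heatmap is thresholded or which layers are used is not specified here and would follow the paper.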
Abstract
In addition to impressive performance, vision transformers have demonstrated remarkable abilities to encode information they were not trained to extract. For example, this information can be used to perform segmentation or single-view depth estimation even though the networks were only trained for image recognition. We show that a similar phenomenon occurs when explicitly training transformers for semantic segmentation in a supervised manner for a set of categories: Once trained, they provide valuable information even about categories absent from the training set. This information can be used to segment objects from these never-seen-before classes in domains as varied as road obstacles, aircraft parked at a terminal, lunar rocks, and maritime hazards.
