SOS: Segment Object System for Open-World Instance Segmentation With Object Priors
Christian Wilms, Tim Rolff, Maris Hillemann, Robert Johanson, Simone Frintrop
TL;DR
This work tackles Open-World Instance Segmentation (OWIS) with SOS, a three-part system that prompts SAM with an object-focused prior to generate high-quality pseudo annotations. A key contribution is identifying DINO self-attention maps as the strongest object prior for focusing SAM on objects, which markedly improves precision when a standard Mask R-CNN is trained on a mix of original and pseudo annotations. Cross-dataset and cross-category evaluations on COCO, LVIS, and ADE20k show strong generalization to unseen object classes, with precision improvements of up to 81.6% over prior methods. Overall, SOS is a practical, plug-and-play approach to OWIS that uses foundation models to improve localization quality and segmentation performance without requiring extra supervision.
Abstract
We propose an approach for Open-World Instance Segmentation (OWIS), a task that aims to segment arbitrary unknown objects in images by generalizing from a limited set of annotated object classes during training. Our Segment Object System (SOS) explicitly addresses the limited generalization ability and the low precision of state-of-the-art systems, which often generate background detections. To this end, we generate high-quality pseudo annotations based on the foundation model SAM. We thoroughly study various object priors to generate prompts for SAM, explicitly focusing the foundation model on objects. The strongest object priors are obtained from self-attention maps of self-supervised Vision Transformers, which we use for prompting SAM. Finally, the post-processed segments from SAM are used as pseudo annotations to train a standard instance segmentation system. Our approach shows strong generalization capabilities on the COCO, LVIS, and ADE20k datasets and improves precision by up to 81.6% compared to the state of the art. Source code is available at: https://github.com/chwilms/SOS
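To make the prompting idea concrete, the following is a minimal sketch of how point prompts could be drawn from an object prior such as a self-attention map. It is not the authors' implementation: the function name `sample_point_prompts`, the parameters `num_points` and `min_distance`, and the greedy peak-picking strategy are all assumptions chosen for illustration, and the toy array stands in for a real attention map.

```python
import numpy as np


def sample_point_prompts(attention, num_points=3, min_distance=2):
    """Greedily pick up to `num_points` peaks from a 2D attention map.

    Hypothetical helper for illustration only. Returns a list of (x, y)
    grid coordinates, strongest peak first. After each pick, a square
    window of radius `min_distance` around the peak is suppressed so
    that subsequent prompts land on different regions.
    """
    att = np.asarray(attention, dtype=float).copy()
    h, w = att.shape
    points = []
    for _ in range(num_points):
        idx = int(np.argmax(att))
        y, x = divmod(idx, w)
        if not np.isfinite(att[y, x]):
            break  # every location has already been suppressed
        points.append((x, y))
        # Suppress the neighbourhood of the selected peak.
        y0, y1 = max(0, y - min_distance), min(h, y + min_distance + 1)
        x0, x1 = max(0, x - min_distance), min(w, x + min_distance + 1)
        att[y0:y1, x0:x1] = -np.inf
    return points


# Toy stand-in for an attention map with two distinct responses.
toy = np.zeros((8, 8))
toy[1, 1] = 1.0
toy[6, 5] = 0.8
print(sample_point_prompts(toy, num_points=2))  # [(1, 1), (5, 6)]
```

In a real pipeline, the grid coordinates would still have to be scaled to image pixel coordinates (for example by the patch size of the Vision Transformer) before being passed to SAM as point prompts, and the resulting masks would need the post-processing mentioned in the abstract.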
