MR-GDINO: Efficient Open-World Continual Object Detection
Bowen Dong, Zitong Huang, Guanglei Yang, Lei Zhang, Wangmeng Zuo
TL;DR
This work defines Open-World Continual Object Detection (OW-COD) and introduces the OW-COD benchmark, which evaluates detectors on old, new, and unseen categories under few-shot continual updates. It proposes MR-GDINO, a memory-based baseline built on a frozen open-world detector. MR-GDINO stores two compact memories (a concept memory and a VL interaction memory) in a scalable memory pool and uses a retrieval mechanism to select the best memories for each input. Experiments show that existing continual detectors suffer severe forgetting on both seen and unseen categories, while MR-GDINO largely mitigates forgetting with only 0.1% activated extra parameters, achieving state-of-the-art performance on old, new, and unseen categories. The approach offers a flexible, scalable, and efficient path toward open-world continual detection in real-world deployment.
Abstract
Open-world (OW) recognition and detection models show strong zero- and few-shot adaptation abilities, inspiring their use as initializations in continual learning methods to improve performance. Despite promising results on seen classes, these OW abilities on unseen classes are largely degraded by catastrophic forgetting. To tackle this challenge, we propose an open-world continual object detection task, which requires detectors to generalize to old, new, and unseen categories in continual learning scenarios. Based on this task, we present a challenging yet practical OW-COD benchmark to assess detection abilities. The goal is to motivate OW detectors to simultaneously preserve learned classes, adapt to new classes, and maintain open-world capabilities under few-shot adaptation. To mitigate forgetting in unseen categories, we propose MR-GDINO, a strong, efficient, and scalable baseline that applies memory and retrieval mechanisms within a highly scalable memory pool. Experimental results show that existing continual detectors suffer from severe forgetting for both seen and unseen categories. In contrast, MR-GDINO largely mitigates forgetting with only 0.1% activated extra parameters, achieving state-of-the-art performance for old, new, and unseen categories.
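The memory-and-retrieval idea can be illustrated with a minimal sketch. This is not the authors' implementation: the class and field names (`MemoryEntry`, `MemoryPool`, `key`, `min_similarity`), the use of cosine similarity for matching, and the fallback to the unmodified frozen detector when no entry is similar enough are all assumptions made for illustration. The two memories are opaque placeholders here, because the summary does not specify their internal form.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MemoryEntry:
    """One slot in the memory pool (all field names are hypothetical)."""
    task_id: str
    key: Sequence[float]           # feature used to match inputs to this entry
    concept_memory: object         # placeholder for the learned concept memory
    vl_interaction_memory: object  # placeholder for the learned VL interaction memory


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryPool:
    """Pool that grows by one entry per adaptation step."""

    def __init__(self, min_similarity: float = 0.5) -> None:
        self.entries: List[MemoryEntry] = []
        # Assumed threshold: below it, no memory is activated.
        self.min_similarity = min_similarity

    def add(self, entry: MemoryEntry) -> None:
        self.entries.append(entry)

    def retrieve(self, query: Sequence[float]) -> Optional[MemoryEntry]:
        """Return the entry most similar to the query, or None.

        None stands for "run the frozen detector without any memory",
        which is how this sketch keeps behavior on unseen inputs unchanged.
        """
        best: Optional[MemoryEntry] = None
        best_score = self.min_similarity
        for entry in self.entries:
            score = cosine_similarity(query, entry.key)
            if score >= best_score:
                best, best_score = entry, score
        return best


pool = MemoryPool(min_similarity=0.8)
pool.add(MemoryEntry("task_a", [1.0, 0.0], "concept_a", "vl_a"))
pool.add(MemoryEntry("task_b", [0.0, 1.0], "concept_b", "vl_b"))

near_task_a = pool.retrieve([0.9, 0.1])  # close to task_a's key
unmatched = pool.retrieve([1.0, 1.0])    # equally far from both keys
```

In this sketch only the retrieved entry is used for a given input, so the number of activated extra parameters stays fixed while the pool itself grows with each adaptation step.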
