UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time
Lars Schmarje, Kaspar Sakman, Reinhard Koch, Dan Zhang
TL;DR
Autonomous driving requires recognizing unknown objects in open-world scenes. The paper introduces UNCOVER, a real-time detector that adds an explicit out-of-distribution (OOD) class and an occupancy-based objectness head, and is trained with Mosaic+ augmentation on data from diverse domains to improve generalization while preserving known-class accuracy. When depth maps are available, a depth-based post-hoc filter uses geometric cues to further reduce false positives. Across Cityscapes, BDD100k, Fishyscapes, and related benchmarks, UNCOVER yields up to 25% improvement in unknown-object recall and 18.4% reduction in false positives with only a modest impact on runtime, demonstrating practical benefits for safer autonomous driving.
Abstract
Autonomous driving (AD) operates in open-world scenarios, where encountering unknown objects is inevitable. However, standard object detectors trained on a limited number of base classes tend to ignore unknown objects, posing potential risks on the road. To address this, it is important to learn a generic rather than a class-specific objectness from the objects seen during training. We therefore introduce occupancy prediction alongside bounding box regression: the detector learns to score objectness as the fraction of the predicted box area that is occupied by actual objects. To improve generalization, we increase object diversity by exploiting data from other domains via Mosaic and Mixup augmentation. Objects outside the AD training classes are assigned to a newly added out-of-distribution (OOD) class. Our solution UNCOVER, for UNknown Class Object detection for autonomous VEhicles in Real-time, achieves both real-time detection and high recall of unknown objects on challenging AD benchmarks. To further attain very low false positive rates, particularly for close objects, we introduce a post-hoc filtering step that uses geometric cues extracted from the depth map, which is typically available within the AD system.
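To make the occupancy-based objectness score concrete, the following is a minimal sketch of one way such a ratio could be computed for a single box. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify how the occupancy target is built, so the binary object mask rasterized from ground-truth boxes, the half-open pixel box convention, and the helper names `boxes_to_mask` and `occupancy_ratio` are all hypothetical.

```python
import numpy as np


def boxes_to_mask(boxes, shape):
    """Rasterize integer boxes (x1, y1, x2, y2) into a binary object mask.

    Assumption: objects are approximated by their ground-truth boxes;
    a segmentation mask could be used instead if one is available.
    """
    mask = np.zeros(shape, dtype=bool)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = True
    return mask


def occupancy_ratio(box, object_mask):
    """Fraction of the box area covered by object pixels, in [0, 1].

    box is (x1, y1, x2, y2) in pixel coordinates, half-open on the
    right and bottom edges. Boxes are clipped to the image bounds.
    """
    height, width = object_mask.shape
    x1, y1, x2, y2 = box
    x1, x2 = int(np.clip(x1, 0, width)), int(np.clip(x2, 0, width))
    y1, y2 = int(np.clip(y1, 0, height)), int(np.clip(y2, 0, height))
    area = (x2 - x1) * (y2 - y1)
    if area <= 0:
        return 0.0  # degenerate or fully out-of-image box
    occupied = object_mask[y1:y2, x1:x2].sum()
    return float(occupied) / area


# Toy example: one 4x4 object in a 10x10 image.
mask = boxes_to_mask([(2, 2, 6, 6)], shape=(10, 10))
tight = occupancy_ratio((2, 2, 6, 6), mask)    # box exactly on the object
shifted = occupancy_ratio((4, 4, 8, 8), mask)  # box overlapping one corner
empty = occupancy_ratio((0, 0, 2, 2), mask)    # box on background only
```

In this toy setting a box that tightly encloses the object scores 1.0, a box on pure background scores 0.0, and a partially overlapping box falls in between. Because the score depends only on how much of the box is filled and not on the object's category, it is one plausible way to obtain a class-agnostic objectness signal of the kind the abstract describes.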
