Detecting Every Object from Events
Haitian Zhang, Chang Xu, Xinya Wang, Bingde Liu, Guang Hua, Lei Yu, Wen Yang
TL;DR
This work tackles class-agnostic object detection (CAOD) in open-world settings using event cameras, which cope with fast-moving objects and challenging illumination. It introduces DEOE, a two-head architecture comprising a Disentangled Objectness Head and a Dual Regressor Head, which uses spatio-temporal consistency and potential-sample screening to discover unknown objects in event streams. The approach outperforms strong baselines across multiple settings and generalizes well in cross-dataset tests, while maintaining inference speeds suitable for real-time perception. It extends CAOD to event-based vision and highlights the value of temporal information for open-world object localization in safety-critical applications.
Abstract
Object detection is critical in autonomous driving, and localizing objects of unknown categories, a task known as Class-Agnostic Object Detection (CAOD), is more practical yet more challenging. Existing studies on CAOD predominantly rely on conventional frame-based cameras, which usually have high latency and limited dynamic range, leading to safety risks in real-world scenarios. In this study, we turn to a new modality, the event camera, which is characterized by sub-millisecond latency and high dynamic range, for robust CAOD. We propose Detecting Every Object in Events (DEOE), an approach tailored for high-speed, class-agnostic open-world object detection in event-based vision. Building upon a fast event-based backbone, the recurrent vision transformer, we jointly consider spatial and temporal consistency to identify potential objects. The discovered potential objects are treated as soft positive samples so that they are not suppressed as background. Moreover, we introduce a disentangled objectness head that separates the foreground-background classification task from the novel object discovery task, enhancing the model's generalization in localizing novel objects while maintaining a strong ability to filter out the background. Extensive experiments confirm the superiority of DEOE over three strong baseline methods that integrate the state-of-the-art event-based object detector with advancements in RGB-based CAOD. Our code is available at https://github.com/Hatins/DEOE.
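
To illustrate the idea of soft positive samples mentioned above, the following is a minimal sketch and not the DEOE implementation. It assumes a binary cross-entropy objectness loss in which annotated objects get a target of 1, background gets 0, and unannotated "potential" objects get an intermediate target. The function names and the `soft_weight` value are hypothetical and chosen only for illustration.

```python
import math

def soft_bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a predicted probability and a soft target in [0, 1]."""
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def objectness_loss(preds, kinds, soft_weight=0.5):
    """Mean objectness loss over samples.

    kinds[i] is one of "positive", "potential", or "background".
    soft_weight is an assumed, illustrative target for potential objects.
    """
    targets = {"positive": 1.0, "potential": soft_weight, "background": 0.0}
    losses = [soft_bce(p, targets[k]) for p, k in zip(preds, kinds)]
    return sum(losses) / len(losses)

# A confident prediction on an unannotated object is penalized less when the
# sample is a soft positive than when it is treated as background.
as_soft_positive = objectness_loss([0.8], ["potential"])
as_background = objectness_loss([0.8], ["background"])
print(as_soft_positive < as_background)
```

Under this assumption, a detector that fires on an unlabeled object is no longer pushed as hard toward a background score, which is the effect the abstract describes as avoiding suppression.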
