OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection

Zhongyu Xia; Jishuo Li; Zhiwei Lin; Xinhao Wang; Yongtao Wang; Ming-Hsuan Yang

TL;DR

OpenAD tackles open-world perception in autonomous driving by introducing the first real-world benchmark for 3D object detection that jointly evaluates domain generalization and open-ended understanding. It presents a corner-case discovery and annotation pipeline that leverages multimodal large language models to annotate corner-case objects across five datasets, yielding 2,000 scenes and 19,761 objects spanning 206 categories. The authors propose a vision-centric 3D open-ended detection baseline that converts 2D proposals into 3D boxes, and a fusion approach that combines open-world and specialized models to balance precision and generalization. Experimental results show that open-world models excel in generalization but lag in in-domain accuracy, while the proposed ensemble and vision-centric baselines achieve strong performance on OpenAD.

Abstract

Open-world perception aims to develop models that adapt to novel domains and various sensor configurations and that can understand uncommon objects and corner cases. However, current research lacks sufficiently comprehensive open-world 3D perception benchmarks and robust, generalizable methodologies. This paper introduces OpenAD, the first real open-world autonomous driving benchmark for 3D object detection. OpenAD is built upon a corner case discovery and annotation pipeline that integrates a multimodal large language model (MLLM). The proposed pipeline annotates corner case objects in a unified format for five autonomous driving perception datasets with 2000 scenarios. In addition, we devise evaluation methodologies and evaluate various open-world and specialized 2D and 3D models. Moreover, we propose a vision-centric 3D open-world object detection baseline and further introduce an ensemble method that fuses general and specialized models to address the lower precision of existing open-world methods on the OpenAD benchmark. We host an online challenge on EvalAI. Data, toolkit code, and evaluation code are available at https://github.com/VDIGPKU/OpenAD.
