Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark
Jiahao Wang, Xiangyu Cao, Jiaru Zhong, Yuner Zhang, Zeyu Han, Haibao Yu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang Wang
TL;DR
Griffin addresses occlusion and limited field of view (FoV) in autonomous perception by introducing a realistic aerial-ground cooperative (AGC) dataset and benchmark. It uses CARLA-AirSim co-simulation to create multi-agent scenes with varied drone altitudes and occlusion-aware 3D annotations, accompanied by a benchmark that evaluates detection and tracking accuracy, communication efficiency, and robustness to latency and localization noise. The work analyzes multiple fusion paradigms and finds that instance-level fusion is more resilient to altitude changes and perturbations, while BEV-level methods are more sensitive to pose and communication errors. These insights point future work toward altitude-adaptive fusion, sparse data exchange, and robust sim-to-real transfer for deployable AGC systems.
Abstract
While cooperative perception can overcome the limitations of single-vehicle systems, the practical implementation of vehicle-to-vehicle and vehicle-to-infrastructure systems is often impeded by significant economic barriers. Aerial-ground cooperation (AGC), which pairs ground vehicles with drones, presents a more economically viable and rapidly deployable alternative. However, this emerging field has been held back by a critical lack of high-quality public datasets and benchmarks. To bridge this gap, we present Griffin, a comprehensive AGC 3D perception dataset featuring over 250 dynamic scenes (37k+ frames). It incorporates varied drone altitudes (20-60 m), diverse weather conditions, realistic drone dynamics via CARLA-AirSim co-simulation, and critical occlusion-aware 3D annotations. Accompanying the dataset is a unified benchmarking framework for cooperative detection and tracking, with protocols to evaluate communication efficiency, altitude adaptability, and robustness to communication latency, data loss, and localization noise. Through experiments across different cooperative paradigms, we demonstrate the effectiveness and limitations of current methods and provide crucial insights for future research. The dataset and code are available at https://github.com/wang-jh18-SVM/Griffin.
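To make the robustness protocols mentioned in the abstract more concrete, the following is a minimal sketch of how communication latency, data loss, and localization noise could be injected into a stream of drone poses before fusion. It is not taken from the Griffin code base: the names `Pose2D`, `add_localization_noise`, `delayed_frame_index`, and `perturb_message`, the Gaussian noise model, and the frame-based delay are all illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pose2D:
    """Hypothetical planar pose of the cooperating drone (not a Griffin type)."""
    x: float    # metres
    y: float    # metres
    yaw: float  # radians


def add_localization_noise(pose: Pose2D, pos_std: float, yaw_std: float,
                           rng: random.Random) -> Pose2D:
    # Assumption: independent zero-mean Gaussian noise on position and heading.
    return Pose2D(
        pose.x + rng.gauss(0.0, pos_std),
        pose.y + rng.gauss(0.0, pos_std),
        pose.yaw + rng.gauss(0.0, yaw_std),
    )


def delayed_frame_index(current: int, latency_s: float, frame_rate_hz: float) -> int:
    # Assumption: latency is modelled as receiving an older frame,
    # rounded to a whole number of frames and clamped at the first frame.
    delay_frames = round(latency_s * frame_rate_hz)
    return max(0, current - delay_frames)


def perturb_message(frames: List[Pose2D], current: int, latency_s: float,
                    frame_rate_hz: float, drop_prob: float, pos_std: float,
                    yaw_std: float, rng: random.Random) -> Optional[Tuple[int, Pose2D]]:
    """Return (source frame index, noisy pose), or None if the message is lost."""
    if rng.random() < drop_prob:  # simulated data loss
        return None
    idx = delayed_frame_index(current, latency_s, frame_rate_hz)
    return idx, add_localization_noise(frames[idx], pos_std, yaw_std, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    poses = [Pose2D(float(i), 0.0, 0.0) for i in range(10)]
    # 0.2 s latency at 10 Hz, 10% drop rate, 0.5 m / 0.02 rad noise (illustrative values).
    print(perturb_message(poses, 5, 0.2, 10.0, 0.1, 0.5, 0.02, rng))
```

Sweeping `latency_s`, `drop_prob`, and the noise standard deviations while re-running a fusion model would give robustness curves of the kind the benchmark describes, although the actual perturbation settings used by Griffin are not stated in the abstract.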
