DeepSense-V2V: A Vehicle-to-Vehicle Multi-Modal Sensing, Localization, and Communications Dataset
Joao Morais, Gouranga Charan, Nikhil Srinivas, Ahmed Alkhateeb
TL;DR
DeepSense-V2V is presented as the first large-scale, real-world, multi-modal V2V dataset that combines mmWave communication with co-located sensing (camera, radar, LiDAR) and precise GPS. The authors describe a two-vehicle testbed, a three-stage data pipeline (collection, processing, visualization), and dataset statistics across diverse urban and intercity scenarios, supporting both sensing and communication research. A key demonstration is position-based beam prediction: the angle of arrival (AoA) estimated from the relative positions of the two vehicles correlates with the optimal mmWave beam, and the resulting predictor reaches robust top-$k$ accuracy across scenarios. The dataset supports work on sensing-aided beamforming, blockage prediction, and robust localization, with practical implications for high-rate, low-latency V2V links and for autonomous networks in 6G and beyond.
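To make the position-based beam prediction idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the receiver's heading is known (for example, derived from successive GPS fixes). It assumes four arrays facing front, right, back, and left, each covering a 90-degree sector. It also assumes a hypothetical codebook of 64 beams per array whose indices sweep uniformly with angle. The function names, the sector layout, and the codebook size are illustrative assumptions; none of them is specified in the text above.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_aoa_deg(rx_lat, rx_lon, rx_heading_deg, tx_lat, tx_lon):
    """Direction of the transmitter as seen from the receiver,
    measured clockwise from the receiver's heading."""
    return (bearing_deg(rx_lat, rx_lon, tx_lat, tx_lon) - rx_heading_deg) % 360.0


def aoa_to_beam(aoa_deg, n_arrays=4, beams_per_array=64):
    """Map a relative AoA to (array index, beam index).
    Assumption: array 0 is centered on 0 degrees (front), arrays are
    numbered clockwise, and beams sweep uniformly across each sector."""
    sector = 360.0 / n_arrays
    shifted = (aoa_deg + sector / 2.0) % 360.0
    array_idx = min(int(shifted // sector), n_arrays - 1)
    within = shifted - array_idx * sector
    beam_idx = min(int(within / sector * beams_per_array), beams_per_array - 1)
    return array_idx, beam_idx


def top_k_beams(aoa_deg, k=3, n_arrays=4, beams_per_array=64):
    """Return k candidate global beam indices: the predicted beam first,
    then its nearest neighbors (offsets 0, -1, +1, ...), wrapping around."""
    total = n_arrays * beams_per_array
    array_idx, beam_idx = aoa_to_beam(aoa_deg, n_arrays, beams_per_array)
    center = array_idx * beams_per_array + beam_idx
    offsets = sorted(range(-(k // 2), k - k // 2), key=abs)
    return [(center + o) % total for o in offsets]
```

A top-$k$ prediction counts as correct when the measured optimal beam is among the `k` candidates returned by `top_k_beams`. In practice the mapping from angle to beam index would have to be calibrated against the measured beam powers rather than assumed uniform.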
Abstract
High-data-rate, low-latency vehicle-to-vehicle (V2V) communication is essential for future intelligent transport systems to enable coordination, enhance safety, and support distributed computing and intelligence requirements. Developing effective communication strategies, however, demands realistic test scenarios and datasets. This is particularly important at the high-frequency bands, where more spectrum is available but exploiting it is complicated by the need for directional transmission and by the sensitivity of signal propagation to blockages. This work presents the first large-scale multi-modal dataset for studying mmWave vehicle-to-vehicle communications. It describes a two-vehicle testbed that collects data from a 360-degree camera, four radars, four 60 GHz phased arrays, a 3D lidar, and two precise GPS receivers. The dataset covers 120 km of driving during the day and at night in intercity and rural settings, at speeds of up to 100 km per hour. More than one million objects, from trucks to bicycles, were detected across all images. This work further includes detailed dataset statistics that demonstrate its coverage of varied situations and highlights how the dataset can enable novel machine-learning applications.
