Timely Fusion of Surround Radar/Lidar for Object Detection in Autonomous Driving Systems
Wenjing Xie, Tao Hu, Neiwen Ling, Guoliang Xing, Chun Jason Xue, Nan Guan
TL;DR
The paper tackles real-time perception for autonomous driving by addressing the limited fusion frequency of surround Radar/Lidar systems. It proposes timely fusion, which pairs each new Lidar frame with the latest available Radar frame. With $F_r$ and $F_l$ denoting the Radar and Lidar frame frequencies, this enables fusion at a frequency $F_f$ with $F_r \le F_f \le F_l$, where $\alpha = F_l / F_f \in \{1, 2, \dots, \mathit{ratio}\}$ and $\mathit{ratio} = \lfloor F_l / F_r \rfloor$. Building on the MVDNet framework, it adds training-time enhancements to handle temporal unalignment, including offset-aware synthesis and strategies for separate or mixed training, plus historical-frame usage and alignment techniques. Experimental results on the ORR dataset show substantial increases in fusion frequency with minimal loss in 3D object detection accuracy across various offsets, demonstrating practical gains for real-time autonomous driving perception.
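The relation between $\alpha$ and the fusion frequency can be illustrated with a small sketch. It is not the authors' code: the function name `admissible_fusion_frequencies` and the example rates (20 Hz Lidar, 4 Hz Radar) are assumptions chosen only to make the arithmetic concrete.

```python
from fractions import Fraction
from math import floor


def admissible_fusion_frequencies(f_lidar, f_radar):
    """Map each alpha in {1, ..., floor(F_l / F_r)} to F_f = F_l / alpha.

    Hypothetical helper for illustration; exact fractions avoid
    floating-point rounding in the ratio.
    """
    if f_radar <= 0 or f_lidar < f_radar:
        raise ValueError("expected 0 < F_r <= F_l")
    f_l, f_r = Fraction(f_lidar), Fraction(f_radar)
    ratio = floor(f_l / f_r)
    return {alpha: f_l / alpha for alpha in range(1, ratio + 1)}


# Assumed example rates, not taken from the text above.
freqs = admissible_fusion_frequencies(20, 4)
for alpha, f_f in freqs.items():
    print(f"alpha={alpha}: F_f = {float(f_f):.2f} Hz")
```

With $\alpha = 1$ the fusion runs at the Lidar rate, and with $\alpha = \mathit{ratio}$ it falls back to (at least) the Radar rate, so every admissible $F_f$ satisfies $F_r \le F_f \le F_l$.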
Abstract
Fusing Radar and Lidar sensor data can fully utilize their complementary advantages and provide a more accurate reconstruction of the surroundings for autonomous driving systems. Surround Radar and Lidar provide 360-degree view sampling at minimal cost, which makes them promising sensing hardware for autonomous driving systems. However, due to intrinsic physical constraints, the rotating speed of surround Radar, and thus the frequency at which it generates Radar data frames, is much lower than that of surround Lidar. Existing Radar/Lidar fusion methods have to work at the low frequency of surround Radar, which cannot meet the high responsiveness requirement of autonomous driving systems. This paper develops techniques to fuse surround Radar/Lidar with a working frequency limited only by the faster surround Lidar instead of the slower surround Radar, based on the state-of-the-art object detection model MVDNet. The basic idea of our approach is simple: we let MVDNet work with temporally unaligned data from Radar/Lidar, so that fusion can take place whenever a new Lidar data frame arrives, instead of waiting for the slow Radar data frame. However, directly applying MVDNet to temporally unaligned Radar/Lidar data greatly degrades its object detection accuracy. The key insight of this paper is that we can achieve high output frequency with little accuracy loss by enhancing the training procedure to exploit the temporal redundancy in MVDNet, so that it can tolerate the temporal unalignment of input data. We explore several ways of enhancing the training and compare them quantitatively through experiments.
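The pairing rule described above, fusing whenever a new Lidar frame arrives by reusing the most recent Radar frame, can be sketched as follows. This is a simplified illustration rather than the paper's implementation: the function name `pair_with_latest_radar` is hypothetical, each frame is reduced to a single integer timestamp in milliseconds, and the 50 ms and 250 ms frame periods are assumed example values.

```python
from bisect import bisect_right


def pair_with_latest_radar(lidar_times_ms, radar_times_ms):
    """Pair each Lidar timestamp with the latest Radar timestamp at or before it.

    Returns a list of (lidar_t, radar_t, offset_ms) tuples, where offset_ms is
    the temporal unalignment between the two frames. Lidar frames that arrive
    before any Radar frame exists are skipped.
    """
    radar_sorted = sorted(radar_times_ms)
    pairs = []
    for lidar_t in sorted(lidar_times_ms):
        idx = bisect_right(radar_sorted, lidar_t)
        if idx == 0:
            continue  # no Radar frame available yet
        radar_t = radar_sorted[idx - 1]
        pairs.append((lidar_t, radar_t, lidar_t - radar_t))
    return pairs


# Assumed example: Lidar every 50 ms, Radar every 250 ms.
lidar = list(range(0, 500, 50))
radar = [0, 250]
for lidar_t, radar_t, offset in pair_with_latest_radar(lidar, radar):
    print(f"lidar@{lidar_t}ms + radar@{radar_t}ms (offset {offset}ms)")
```

In this example the offset cycles through 0, 50, 100, 150, and 200 ms. Offsets of this kind are the temporal unalignment that the enhanced training is meant to tolerate.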
