Enhancing mmWave Radar Point Cloud via Visual-inertial Supervision
Cong Fan, Shengkai Zhang, Kezhong Liu, Shuai Wang, Zheng Yang, Wei Wang
TL;DR
mmEMP tackles sparse mmWave radar point clouds by using visual-inertial (VI) supervision from low-cost cameras and IMUs to densify radar data without LiDAR. It introduces a dynamic VI 3D reconstruction that recovers the positions of moving features, and a VI-informed refinement pipeline that removes spurious multipath points while densifying the radar point cloud. The method combines a non-linear least-squares formulation for moving features, GAN-based densification of range-Doppler inputs, and a rigid-transformation learning and space-time stability mechanism that aligns adjacent frames. On a large real-world dataset of radar range-Doppler maps (RDMs), images, and IMU data, mmEMP achieves performance competitive with the LiDAR-supervised state of the art in point-density and geometry metrics, and it improves downstream object detection, localization, and mapping. Because it needs only a camera and an IMU, the approach allows training data to be crowdsourced from commercial vehicles, potentially lowering the cost barrier for robust radar-based perception in adverse weather.
Abstract
Complementary to prevalent LiDAR and camera systems, millimeter-wave (mmWave) radar is robust to adverse weather conditions such as fog, rainstorms, and blizzards, but it produces sparse point clouds. Current techniques enhance the point cloud under the supervision of LiDAR data. However, high-performance LiDAR is notably expensive and is not commonly available on vehicles. This paper presents mmEMP, a supervised learning approach that enhances radar point clouds using a low-cost camera and an inertial measurement unit (IMU), enabling training data to be crowdsourced from commercial vehicles. Introducing visual-inertial (VI) supervision is challenging because VI sensing is spatially agnostic to dynamic objects, i.e., it does not directly provide their 3D positions. Moreover, spurious radar points caused by RF multipath can mislead a robot's understanding of the scene. mmEMP first devises a dynamic 3D reconstruction algorithm that restores the 3D positions of dynamic features. We then design a neural network that densifies radar data and eliminates spurious radar points. We build a new real-world dataset. Extensive experiments show that mmEMP achieves performance competitive with the state-of-the-art (SOTA) approach trained on LiDAR data. In addition, we use the enhanced point cloud to perform object detection, localization, and mapping to demonstrate mmEMP's effectiveness.
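The summary mentions aligning adjacent frames with a rigid transformation. As a rough illustration of what such a transformation is, the sketch below recovers a 2D rotation and translation between two point sets in closed form. This is not mmEMP's method: the paper learns the transformation, whereas this example assumes known point correspondences, works in 2D only, and uses hypothetical names (`estimate_rigid_2d`, `apply_rigid_2d`, `prev_frame`, `curr_frame`) and synthetic data.

```python
import math


def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src, dst: equal-length lists of (x, y) tuples with known correspondences
    (an assumption of this sketch; real radar frames have no such matching).
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    n = len(src)
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centered pairs.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The optimal rotation angle follows from the two sums.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)


def apply_rigid_2d(points, theta, t):
    """Rotate each point by theta (radians), then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


# Synthetic data: the "current" frame is the "previous" frame moved rigidly.
prev_frame = [(1.0, 0.0), (2.0, 1.0), (0.5, 3.0), (-1.0, 2.0)]
curr_frame = apply_rigid_2d(prev_frame, 0.3, (1.0, -2.0))
theta, t = estimate_rigid_2d(prev_frame, curr_frame)
print(theta, t)
```

With noise-free correspondences the estimate matches the motion used to generate `curr_frame` up to floating-point error. Real radar frames are sparse, noisy, and unmatched, which is presumably part of why a learned alignment is used instead.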
