Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications
Huawei Sun, Hao Feng, Gianfranco Mauro, Julius Ott, Georg Stettinger, Lorenzo Servadei, Robert Wille
TL;DR
This paper tackles the lack of height information in radar point clouds by learning per-point heights through a 2D height map predicted in the image plane. It introduces a robust regression loss with dynamic pixel weighting and a multi-task setup that includes free-space segmentation to prevent degenerate all-zero predictions. The approach reduces the average absolute radar height error from $1.69$ m to $0.25$ m, and using the refined radar data improves downstream radar-camera perception tasks such as object detection and depth estimation. By enhancing radar data quality, the method strengthens sensor fusion pipelines and enables more reliable perception in autonomous driving scenarios.
Abstract
Radar and camera fusion yields robustness in perception tasks by leveraging the strengths of both sensors. The radar point cloud typically extracted is 2D, without height information, because there are too few antennas along the elevation axis, which limits network performance. This work introduces a learning-based approach to infer the height of radar points associated with 3D objects. A novel robust regression loss is introduced to address the sparse target challenge. In addition, a multi-task training strategy is employed, emphasizing important features. The average absolute radar height error decreases from 1.69 meters with the state-of-the-art height extension method to 0.25 meters. The estimated target height values are used to preprocess and enrich radar data for downstream perception tasks. Integrating this refined radar information further enhances the performance of existing radar-camera fusion models for object detection and depth estimation tasks.
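To make the height-assignment idea concrete, the following is a minimal sketch, not the authors' implementation, of how per-point heights could be read from a dense 2D height map in the image plane and scored with a mean absolute error. The function names `lookup_point_heights` and `mean_absolute_height_error`, the nearest-pixel sampling, and the toy values are all assumptions made for illustration. The sketch also assumes the radar points have already been projected to pixel coordinates.

```python
import numpy as np


def lookup_point_heights(height_map, pixels_uv):
    """Read one height per radar point from a dense 2D height map.

    height_map: (H, W) array of heights in meters, defined in the image plane.
    pixels_uv:  (N, 2) array of radar points projected to pixel coordinates,
                stored as (u, v) = (column, row).
    Nearest-pixel sampling is an assumption of this sketch.
    """
    h, w = height_map.shape
    uv = np.rint(pixels_uv).astype(int)
    # Clamp to the image bounds so points near the border stay valid.
    u = np.clip(uv[:, 0], 0, w - 1)
    v = np.clip(uv[:, 1], 0, h - 1)
    return height_map[v, u]


def mean_absolute_height_error(pred, target):
    """Average absolute difference between predicted and reference heights."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(pred - target)))


# Toy data (hypothetical): a 4x6 height map with two non-zero pixels.
height_map = np.zeros((4, 6))
height_map[1, 2] = 1.5
height_map[3, 5] = 0.5

pixels_uv = np.array([[2.2, 0.9], [5.0, 3.0]])  # (u, v) per radar point
pred = lookup_point_heights(height_map, pixels_uv)
target = np.array([1.0, 1.0])  # made-up reference heights
mae = mean_absolute_height_error(pred, target)
```

In this toy setup the two points pick up heights of 1.5 m and 0.5 m, giving a mean absolute error of 0.5 m against the made-up references. A real pipeline would need the camera calibration to obtain `pixels_uv`, and might prefer bilinear interpolation over nearest-pixel lookup.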
