Automatic Labelling & Semantic Segmentation with 4D Radar Tensors
Botao Sun, Ignacio Roldan, Francesco Fioranelli
TL;DR
This paper addresses the scarcity of semantic segmentation methods for automotive 4D radar. It proposes a segmentation approach that operates directly on range-azimuth-elevation-Doppler (RAED) tensors, together with an automatic labelling pipeline that fuses LiDAR, camera, and clustering cues to produce ground-truth labels for the RaDelft dataset. The labelling pipeline generates point-wise multi-class labels through preliminary LiDAR detections, camera-based calibration, and a voxel-wise transformation to the radar grid, yielding ground truth on that grid for training. The segmentation network converts RAED data to a Range-Azimuth-Elevation representation, uses dual 2D backbones to form occupancy and class latent spaces, and employs a 3D U-Net to predict per-voxel class probabilities. It is trained with a combination of weighted cross-entropy and soft-dice losses. On RaDelft, the method achieves over 65% of the LiDAR detection performance, improves vehicle detection probability by 13.2%, and reduces the Chamfer distance by 0.54 m relative to variants inspired by the literature, demonstrating the practicality of radar-based semantic segmentation for robust ADAS perception.
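To make the training objective mentioned above more concrete, the following is a minimal NumPy sketch of a loss that mixes weighted cross-entropy with soft Dice over per-voxel class probabilities. The function names, the mixing factor `alpha`, and the `class_weights` values are illustrative assumptions; the summary above does not state how the paper weights the two terms or the classes.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """probs: (N, C) per-voxel class probabilities; labels: (N,) integer class ids."""
    picked = probs[np.arange(labels.size), labels]   # probability of the true class
    w = class_weights[labels]                        # per-voxel weight from its class
    return float(np.sum(-w * np.log(picked + eps)) / np.sum(w))

def soft_dice_loss(probs, labels, eps=1e-6):
    """One minus the soft Dice coefficient, averaged over classes."""
    one_hot = np.eye(probs.shape[1])[labels]         # (N, C) one-hot ground truth
    intersection = np.sum(probs * one_hot, axis=0)
    denom = np.sum(probs, axis=0) + np.sum(one_hot, axis=0)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return float(1.0 - dice.mean())

def combined_loss(probs, labels, class_weights, alpha=0.5):
    """Assumed mix: alpha * weighted CE + (1 - alpha) * soft Dice."""
    ce = weighted_cross_entropy(probs, labels, class_weights)
    dice = soft_dice_loss(probs, labels)
    return alpha * ce + (1.0 - alpha) * dice

# Toy usage: three voxels, three classes, uniform predictions.
labels = np.array([0, 1, 2])
uniform = np.full((3, 3), 1.0 / 3.0)
weights = np.ones(3)
print(combined_loss(uniform, labels, weights))
```

In this sketch the cross-entropy term handles class imbalance through `class_weights`, while the Dice term rewards overlap per class regardless of how many voxels that class occupies. This is a common motivation for pairing the two on sparse occupancy grids.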
Abstract
In this paper, an automatic labelling process is presented for automotive datasets, leveraging complementary information from LiDAR and camera. The generated labels are then used as ground truth, with the corresponding 4D radar data as input to a proposed semantic segmentation network that associates a class label with each spatial voxel. Promising results are shown by applying both approaches to the publicly shared RaDelft dataset: the proposed network achieves over 65% of the LiDAR detection performance, improves vehicle detection probability by 13.2%, and reduces the Chamfer distance by 0.54 m compared to variants inspired by the literature.
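As a reference for the Chamfer distance figure quoted above, here is a minimal sketch of one common symmetric definition: the average of the two directional mean nearest-neighbour Euclidean distances between point clouds. The exact variant used in the paper (for example, summed rather than averaged directions, or squared distances) is not specified in this abstract, so this form is an assumption, as is the function name.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (Na, 3) and b (Nb, 3).

    Assumed definition: mean nearest-neighbour distance from a to b and from
    b to a, averaged over the two directions. Units follow the inputs (metres).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances via broadcasting: shape (Na, Nb).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1).mean()   # each point in a to its nearest point in b
    b_to_a = d.min(axis=0).mean()   # each point in b to its nearest point in a
    return float(0.5 * (a_to_b + b_to_a))

# Toy usage: a predicted radar point cloud against a LiDAR reference.
radar = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
lidar = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
print(chamfer_distance(radar, lidar))
```

The dense pairwise matrix keeps the sketch short but grows as Na × Nb in memory. For full-size point clouds, a spatial index such as a k-d tree would be the usual substitute.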
