RadarGaussianDet3D: Gaussian Representation-based Real-time 3D Object Detection with 4D Automotive Radars

Weiyi Xiong, Bing Zhu, Zewei Zheng

Abstract

4D automotive radars have gained increasing attention for autonomous driving due to their low cost, robustness, and inherent velocity measurement capability. However, existing 4D radar-based 3D detectors rely heavily on pillar encoders for BEV feature extraction, where each point contributes to only a single BEV grid cell, resulting in sparse feature maps and degraded representation quality. They also optimize bounding box attributes independently, leading to sub-optimal detection accuracy. Moreover, their inference speed, while sufficient on high-end GPUs, may fail to meet the real-time requirement on vehicle-mounted embedded devices. To overcome these limitations, an efficient and effective Gaussian-based 3D detector, RadarGaussianDet3D, is introduced, leveraging Gaussian primitives and distributions as intermediate representations for radar points and bounding boxes. In RadarGaussianDet3D, a novel Point Gaussian Encoder (PGE) transforms each point into a Gaussian primitive after feature aggregation and employs the 3D Gaussian Splatting (3DGS) technique for BEV rasterization, yielding denser feature maps. PGE exhibits exceptionally low latency, owing to an optimized algorithm for point feature aggregation and the fast rendering of 3DGS. In addition, a new Box Gaussian Loss (BGL) is proposed, which converts bounding boxes into 3D Gaussian distributions and measures their distance to enable more comprehensive and consistent optimization. Extensive experiments on TJ4DRadSet and View-of-Delft demonstrate that RadarGaussianDet3D achieves high detection accuracy while delivering substantially faster inference, highlighting its potential for real-time deployment in autonomous driving.
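
The Box Gaussian Loss idea of turning boxes into Gaussian distributions and measuring a distance between them can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a box is given as (x, y, z, l, w, h, yaw), that the half-extents act as standard deviations along the box axes, and that the distance is the KL divergence. The abstract specifies neither the scaling nor the distance measure, so these choices, along with the hypothetical function names `box_to_gaussian` and `gaussian_kl`, are illustrative only.

```python
import math

def box_to_gaussian(box):
    """Map a box (x, y, z, l, w, h, yaw) to a 3D Gaussian.

    Assumption: half-extents are used as standard deviations.
    Yaw rotates about the z axis, so the covariance is block diagonal:
    a 2x2 BEV block (sxx, sxy, syy) plus an independent z variance.
    """
    x, y, z, l, w, h, yaw = box
    a, b = (l / 2) ** 2, (w / 2) ** 2
    c, s = math.cos(yaw), math.sin(yaw)
    # Entries of R diag(a, b) R^T with R = [[c, -s], [s, c]]
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (x, y, z), (sxx, sxy, syy), (h / 2) ** 2

def gaussian_kl(box_p, box_q):
    """KL(N_p || N_q) between the Gaussians of two boxes (illustrative distance)."""
    (px, py, pz), (pxx, pxy, pyy), pzz = box_to_gaussian(box_p)
    (qx, qy, qz), (qxx, qxy, qyy), qzz = box_to_gaussian(box_q)
    det_p = pxx * pyy - pxy * pxy
    det_q = qxx * qyy - qxy * qxy
    dx, dy, dz = qx - px, qy - py, qz - pz
    # Trace term tr(Sigma_q^{-1} Sigma_p), split into BEV block and z part
    trace = (qyy * pxx - 2 * qxy * pxy + qxx * pyy) / det_q + pzz / qzz
    # Mahalanobis term d^T Sigma_q^{-1} d
    mahal = (qyy * dx * dx - 2 * qxy * dx * dy + qxx * dy * dy) / det_q + dz * dz / qzz
    log_det = math.log((det_q * qzz) / (det_p * pzz))
    return 0.5 * (trace + mahal - 3.0 + log_det)

if __name__ == "__main__":
    gt = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5, 0.0)
    pred = (1.0, 0.0, 0.0, 4.0, 2.0, 1.5, 0.0)  # same box shifted 1 m along x
    print(gaussian_kl(gt, gt), gaussian_kl(pred, gt))
```

In this sketch, position, size, and orientation all enter a single mean and covariance, so one scalar couples every box attribute instead of treating each attribute as a separate regression target. One consequence of the Gaussian form is that a box and its copy rotated by 180 degrees map to the same distribution.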

Paper Structure

This paper contains 13 sections, 11 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Illustration of different point scattering methods. Purple arrows from a point indicate the BEV grid cells influenced by that point. (a) Pillar scatter in PointPillars maps each point to a single grid cell based on coordinates. (b) Enhanced scatter methods, such as the RCS-aware scatter in RCBEVDet, define a neighborhood for each point and assign it to all grid cells whose centers lie within the neighborhood. (c) Gaussian splatting in the proposed RadarGaussianDet3D rasterizes point-converted Gaussian primitives onto the BEV plane, allowing each point to contribute to all overlapping grid cells.
  • Figure 2: Overview of RadarGaussianDet3D.
  • Figure 3: Visualization of different LFA implementations. For simplicity, concatenation with position offsets is omitted.
  • Figure 4: Visualization results of RadarGaussianDet3D on the TJ4DRadSet test set (left) and VoD val set (right). Gray points denote 4D radar points in BEV, orange and blue boxes indicate ground-truth and predicted bounding boxes, respectively, and the red triangle marks the ego-vehicle position.
  • Figure 5: Visualization of BEV feature maps from CenterPoint-Pillar (left) and RadarGaussianDet3D without BGL (right) on the TJ4DRadSet test set. The first row presents the initial BEV feature map after the pillar encoder or PGE, the second row displays the BEV feature map output by the backbone and neck, and the last row shows the radar point cloud and final detection results.
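
The density difference between scattering methods (a) and (c) in Figure 1 can be illustrated with a toy comparison. The sketch below is a simplification under stated assumptions, not the paper's PGE or a real 3DGS rasterizer. It uses a fixed isotropic standard deviation `sigma` per point, a hard cutoff at three standard deviations, and scalar weights instead of feature vectors, and it omits opacity and blending. The function names and parameters are hypothetical.

```python
import math

def pillar_scatter(points, grid_size, cell):
    """Pillar-style scatter: each point writes to exactly one BEV cell."""
    bev = [[0.0] * grid_size for _ in range(grid_size)]
    for x, y in points:
        i, j = int(y // cell), int(x // cell)
        if 0 <= i < grid_size and 0 <= j < grid_size:
            bev[i][j] += 1.0
    return bev

def gaussian_splat(points, grid_size, cell, sigma, cutoff=3.0):
    """Toy splat: each point adds a Gaussian weight to every cell whose
    center lies within cutoff * sigma (isotropic sigma is an assumption)."""
    bev = [[0.0] * grid_size for _ in range(grid_size)]
    radius2 = (cutoff * sigma) ** 2
    for x, y in points:
        for i in range(grid_size):
            for j in range(grid_size):
                cx, cy = (j + 0.5) * cell, (i + 0.5) * cell
                d2 = (cx - x) ** 2 + (cy - y) ** 2
                if d2 <= radius2:
                    bev[i][j] += math.exp(-0.5 * d2 / sigma ** 2)
    return bev

def occupied(bev):
    """Count BEV cells that received any contribution."""
    return sum(1 for row in bev for v in row if v > 0.0)

if __name__ == "__main__":
    pts = [(2.3, 2.7), (5.1, 1.2)]  # two radar points in BEV (meters)
    sparse = pillar_scatter(pts, grid_size=8, cell=1.0)
    dense = gaussian_splat(pts, grid_size=8, cell=1.0, sigma=0.6)
    print(occupied(sparse), occupied(dense))
```

With only two points, the pillar-style map has at most two non-empty cells, while the splatted map covers every cell center near each point, which is the density effect that Figure 1(c) depicts.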