4D-ROLLS: 4D Radar Occupancy Learning via LiDAR Supervision

Ruihan Liu, Xiaoyi Wu, Xijun Chen, Liang Hu, Yunjiang Lou

TL;DR

This work tackles the challenge of estimating occupancy in 3D scenes using 4D radar, which is robust in adverse weather but suffers from sparsity and noise. It introduces 4D-ROLLS, a weakly supervised framework that learns radar occupancy by leveraging LiDAR-derived pseudo-labels, including occupancy queries and LiDAR height maps, and uses a TPV-based encoding with a height-constrained loss to align radar outputs with LiDAR occupancy. A two-stage training procedure (initial LiDAR-guided learning followed by fine-tuning with LiDAR self-supervision) yields robust occupancy estimates and enables effective transfer to BEV segmentation and 3D occupancy prediction, even across datasets. The model runs at real-time speeds (≈30 Hz) on a consumer-grade GPU and demonstrates strong generalization, robustness in degraded environments, and practical downstream applicability, making it a promising all-weather perception solution for autonomous systems.

Abstract

A comprehensive understanding of 3D scenes is essential for autonomous vehicles (AVs), and among various perception tasks, occupancy estimation plays a central role by providing a general representation of drivable and occupied space. However, most existing occupancy estimation methods rely on LiDAR or cameras, which perform poorly in degraded environments such as smoke, rain, snow, and fog. In this paper, we propose 4D-ROLLS, the first weakly supervised occupancy estimation method for 4D radar using the LiDAR point cloud as the supervisory signal. Specifically, we introduce a method for generating pseudo-LiDAR labels, including occupancy queries and LiDAR height maps, as multi-stage supervision to train the 4D radar occupancy estimation model. The model is then aligned with the occupancy map produced by LiDAR to fine-tune its occupancy estimation accuracy. Extensive comparative experiments validate the exceptional performance of 4D-ROLLS. Its robustness in degraded environments and effectiveness in cross-dataset training are qualitatively demonstrated. The model is also seamlessly transferred to two downstream tasks, BEV segmentation and point cloud occupancy prediction, highlighting its potential for broader applications. The lightweight network enables the 4D-ROLLS model to achieve fast inference speeds of about 30 Hz on a 4060 GPU. The code of 4D-ROLLS will be made available at https://github.com/CLASS-Lab/4D-ROLLS.
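
The abstract names LiDAR height maps as one of the pseudo-labels but does not define them at this point. As a rough illustration only, one common way to build such a map is to rasterise the LiDAR points into a bird's-eye-view (BEV) grid and keep the highest z value seen in each cell. The function name `lidar_height_map`, the grid extents, the cell size, and the max-z convention below are all assumptions made for this sketch, not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, sensor frame (assumed)


def lidar_height_map(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (0.0, 4.0),   # hypothetical grid extent
    y_range: Tuple[float, float] = (-2.0, 2.0),  # hypothetical grid extent
    cell_size: float = 1.0,                      # hypothetical resolution
) -> List[List[Optional[float]]]:
    """Rasterise a point cloud into a BEV grid holding the max z per cell.

    Cells that receive no LiDAR point stay None, so "unknown" is kept
    distinct from "height 0".
    """
    n_rows = int(round((x_range[1] - x_range[0]) / cell_size))
    n_cols = int(round((y_range[1] - y_range[0]) / cell_size))
    grid: List[List[Optional[float]]] = [[None] * n_cols for _ in range(n_rows)]

    for x, y, z in points:
        # Skip points outside the grid extent.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Map metric coordinates to cell indices, clamped against rounding.
        row = min(int((x - x_range[0]) // cell_size), n_rows - 1)
        col = min(int((y - y_range[0]) // cell_size), n_cols - 1)
        current = grid[row][col]
        if current is None or z > current:
            grid[row][col] = z
    return grid


# Toy cloud: two points share a cell, one lies outside the grid.
cloud = [(0.5, -1.5, 0.2), (0.7, -1.2, 1.4), (3.2, 1.9, -0.3), (10.0, 0.0, 5.0)]
height_map = lidar_height_map(cloud)
```

A map like this could serve as a per-column upper bound when penalising radar occupancy predicted above the observed LiDAR height, which is one plausible reading of the height-constrained loss mentioned in the TL;DR. The exact formulation is not given in this section.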

Paper Structure

This paper contains 14 sections, 4 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: We compare our method with the classic LiDAR-based occupancy estimation approach ALSO (Boulch et al., 2023) in foggy scenes. Under normal weather conditions, both methods demonstrate strong performance. However, in degraded environments such as fog, our method remains robust, whereas ALSO fails due to the significant reduction or even complete absence of LiDAR points.
  • Figure 2: Overview of the framework, including occupancy estimation, prediction results, and BEV segmentation performance.
  • Figure 3: Qualitative comparison on the MSC dataset. Despite the sparse 4D radar point cloud, our approach effectively infers fine-grained scene details. Our method achieves promising results in Stage 1. Through fine-tuning, it further refines the predictions, correcting points that the 4D radar previously failed to capture.
  • Figure 4: Qualitative comparison on the NTU dataset, including cross-dataset testing and intra-dataset testing.
  • Figure 5: The results of downstream tasks using our method.