Depth-aware Fusion Method based on Image and 4D Radar Spectrum for 3D Object Detection

Yue Sun, Yeqiang Qian, Chunxiang Wang, Ming Yang

TL;DR

The paper addresses robust 3D object detection for autonomous driving under adverse weather by fusing depth-aware camera images with 4D millimeter-wave radar spectra. It introduces a BEV fusion framework with polar-aligned attention that combines depth-enriched image features and radar spectral features, and a GAN-based depth generator to synthesize depth maps from radar spectra when depth sensors are unavailable. It employs multi-scale feature extraction, a compact detection head with Hungarian loss, and demonstrates improvements on the K-Radar dataset, outperforming radar-point-cloud baselines while reducing network complexity. The work advances all-weather perception by leveraging complementary sensing modalities in a cost-effective, robust pipeline, with future directions in radar data preprocessing and improved depth generation.
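As a rough illustration of the matching step that a Hungarian loss relies on, the sketch below assigns each ground-truth object to a distinct prediction so that the total cost is minimal. Everything here is an assumption made for illustration: the cost (L1 distance between BEV box centers), the function names, and the toy coordinates are hypothetical and not taken from the paper. For brevity it searches all assignments by brute force, which gives the same optimum as the Hungarian algorithm on small sets but does not scale.

```python
from itertools import permutations


def match_cost(pred, gt):
    # Hypothetical cost: L1 distance between BEV box centers (x, y) in meters.
    return abs(pred[0] - gt[0]) + abs(pred[1] - gt[1])


def optimal_matching(preds, gts):
    """Assign every ground truth to a distinct prediction with minimal total cost.

    Brute-force search over assignments; assumes len(preds) >= len(gts).
    Returns a list of (pred_index, gt_index) pairs and the total cost.
    """
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(match_cost(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    pairs = [(p, g) for g, p in enumerate(best_perm)]
    return pairs, best_cost


# Toy data (made up): three predicted centers, two ground-truth centers.
preds = [(10.0, 2.0), (30.5, -1.0), (11.0, 5.0)]
gts = [(30.0, -1.0), (10.0, 2.5)]

pairs, total = optimal_matching(preds, gts)
print(pairs, total)  # [(1, 0), (0, 1)] 1.0
```

In set-based detection losses of this kind, predictions left unmatched (index 2 here) are typically supervised as "no object", while the regression and classification terms are computed only on the matched pairs.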

Abstract

Safety and reliability are crucial for the public acceptance of autonomous driving. To ensure accurate and reliable environmental perception, intelligent vehicles must remain robust across diverse environments. Millimeter-wave radar, known for its high penetration capability, can operate effectively in adverse weather conditions such as rain, snow, and fog. Traditional 3D millimeter-wave radars provide only range, Doppler, and azimuth information for objects. Although the recent emergence of 4D millimeter-wave radars has added elevation resolution, the radar point clouds remain sparse due to Constant False Alarm Rate (CFAR) operations. In contrast, cameras offer rich semantic details but are sensitive to lighting and weather conditions. Hence, this paper leverages these two highly complementary and cost-effective sensors: 4D millimeter-wave radar and camera. By integrating 4D radar spectra with depth-aware camera images and employing attention mechanisms, we fuse texture-rich images with depth-rich radar data in the Bird's Eye View (BEV) perspective, enhancing 3D object detection. Additionally, we propose using GAN-based networks to generate depth images from radar spectra in the absence of depth sensors, further improving detection accuracy.
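To make the sparsity argument concrete, here is a minimal sketch of a 1D cell-averaging CFAR detector. The abstract does not say which CFAR variant is used, so the cell-averaging form, the window sizes, the scale factor, and the toy power profile are all assumptions chosen for illustration. The point it shows is that only cells well above the local noise estimate survive, so weaker returns that are still present in the spectrum never reach the point cloud.

```python
def ca_cfar(power, num_train=4, num_guard=1, scale=4.0):
    """Return indices of cells whose power exceeds scale * local noise estimate.

    The noise estimate is the mean of num_train training cells on each side
    of the cell under test, skipping num_guard guard cells next to it.
    Cells too close to either edge are not tested.
    """
    detections = []
    half = num_train + num_guard
    for i in range(half, len(power) - half):
        left = power[i - half : i - num_guard]           # training cells before
        right = power[i + num_guard + 1 : i + half + 1]  # training cells after
        noise = (sum(left) + sum(right)) / (len(left) + len(right))
        if power[i] > scale * noise:
            detections.append(i)
    return detections


# Toy range profile (made up): flat noise floor, one strong and one weak return.
profile = [1.0] * 32
profile[10] = 20.0  # strong target
profile[22] = 3.0   # weak return, above the noise floor but below the threshold

detections = ca_cfar(profile)
print(detections)  # [10]
```

Of the 32 cells, only the strong target at index 10 is kept. The weak return at index 22 is discarded by the threshold, which is the kind of information loss that working on the radar spectrum rather than the point cloud is meant to avoid.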

Paper Structure

This paper contains 17 sections, 6 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Traditional signal processing pipeline of 4D millimeter-wave radar data and the corresponding data format after each step [4dsurvey].
  • Figure 2: The overall architecture of our proposed algorithm.
  • Figure 3: The network architecture of our depth image generator, based on the design proposed in [3drimr].
  • Figure 4: Visualization of outputs from different methods compared to ground truth.