RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim, Dongsuk Kum, Jun Won Choi

TL;DR

RadarDistill tackles the challenge of noisy, sparse radar data for 3D object detection by transferring richly structured LiDAR representations into the radar domain through a three-component knowledge distillation (KD) framework. Cross-Modality Alignment (CMA) densifies radar BEV features so that knowledge transfers more effectively across modalities, while Activation-based Feature Distillation (AFD) and Proposal-based Feature Distillation (PFD) selectively align low- and high-level features in informative regions and object proposals, respectively. The LiDAR branch is used only during training; at inference, the radar-only detector, built on a PillarNet baseline, achieves state-of-the-art radar-only performance on nuScenes (20.5% mAP, 43.7% NDS) and also improves camera-radar fusion. The approach offers a practical way to exploit LiDAR-derived semantics during training to improve radar-based perception in adverse conditions.

Abstract

The inherently noisy and sparse characteristics of radar data pose challenges in finding effective representations for 3D object detection. In this paper, we propose RadarDistill, a novel knowledge distillation (KD) method, which can improve the representation of radar data by leveraging LiDAR data. RadarDistill successfully transfers desirable characteristics of LiDAR features into radar features using three key components: Cross-Modality Alignment (CMA), Activation-based Feature Distillation (AFD), and Proposal-based Feature Distillation (PFD). CMA enhances the density of radar features by employing multiple layers of dilation operations, effectively addressing the challenge of inefficient knowledge transfer from LiDAR to radar. AFD selectively transfers knowledge based on regions of the LiDAR features, with a specific focus on areas where activation intensity exceeds a predefined threshold. PFD similarly guides the radar network to selectively mimic features from the LiDAR network within the object proposals. Our comparative analyses conducted on the nuScenes dataset demonstrate that RadarDistill achieves state-of-the-art (SOTA) performance for the radar-only object detection task, recording 20.5% in mAP and 43.7% in NDS. RadarDistill also significantly improves the performance of the camera-radar fusion model.
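
The AFD idea described in the abstract (distilling only where LiDAR activation exceeds a threshold) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the activation definition (channel-wise mean of absolute values), the masked mean-squared-error form, the `threshold` value, and the names `activation_map` and `afd_loss` are all assumptions made for illustration. The paper's AFD also distinguishes active and inactive regions using both radar and LiDAR features, which this sketch omits.

```python
import numpy as np

def activation_map(feat):
    """Per-cell activation intensity of a (C, H, W) BEV feature map.

    Assumed definition: mean absolute value over the channel axis.
    """
    return np.abs(feat).mean(axis=0)

def afd_loss(radar_feat, lidar_feat, threshold=0.1):
    """Masked L2 distillation loss (hypothetical form).

    Only BEV cells whose LiDAR activation exceeds `threshold` contribute.
    Both inputs have shape (C, H, W).
    """
    mask = activation_map(lidar_feat) > threshold            # (H, W) boolean
    n_active = int(mask.sum())
    if n_active == 0:
        return 0.0                                           # nothing to distill
    sq_err = ((radar_feat - lidar_feat) ** 2).mean(axis=0)   # (H, W)
    return float(sq_err[mask].sum() / n_active)

# Toy example: LiDAR is active in a 2x2 patch, radar is empty everywhere.
lidar = np.zeros((4, 8, 8))
lidar[:, 2:4, 2:4] = 1.0
radar = np.zeros((4, 8, 8))
print(afd_loss(radar, lidar))  # 1.0
print(afd_loss(lidar, lidar))  # 0.0
```

Restricting the loss to active cells keeps the large empty background of the BEV grid from dominating the distillation signal, which is the motivation the abstract gives for selective transfer.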

Paper Structure (16 sections, 14 equations, 3 figures, 7 tables)

Figures (3)

  • Figure 1: Illustration of Proposed RadarDistill. Our RadarDistill method facilitates knowledge transfer from LiDAR features to radar features, enhancing the quality of radar features for Bird's Eye View (BEV) object detection.
  • Figure 2: Overall architecture of RadarDistill. The input point clouds from each modality are independently processed through Pillar Encoding followed by SparseEnc to extract low-level BEV features. CMA is then employed to densify the low-level BEV features in the radar branch. AFD then identifies active and inactive regions based on both radar and LiDAR features and minimizes their associated distillation losses. Subsequently, PFD conducts knowledge distillation based on proposal-level features obtained from DenseEnc. Note that the LiDAR branch is solely utilized during the training phase to enhance the radar pipeline and is not required during inference.
  • Figure 3: Detailed structure of the proposed CMA module.
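
To give a feel for why dilation densifies sparse radar BEV features, as the CMA description above suggests, the following sketch applies a plain 3x3 binary dilation to an occupancy grid. It is only an illustration under stated assumptions: the actual CMA module uses learned layers whose structure is shown in Figure 3 and is not reproduced here, and the function name `dilate` and the grid size are hypothetical.

```python
import numpy as np

def dilate(occupancy, iterations=1):
    """Repeated 3x3 binary dilation of an (H, W) boolean occupancy grid."""
    occ = occupancy.astype(bool)
    h, w = occ.shape
    for _ in range(iterations):
        # Zero-pad so that border cells have a full 3x3 neighbourhood.
        padded = np.pad(occ, 1, mode="constant", constant_values=False)
        out = np.zeros_like(occ)
        # A cell becomes occupied if any of its 9 neighbours was occupied.
        for dy in range(3):
            for dx in range(3):
                out |= padded[dy:dy + h, dx:dx + w]
        occ = out
    return occ

# A single occupied radar cell in a 9x9 BEV grid.
grid = np.zeros((9, 9), dtype=bool)
grid[4, 4] = True
print(grid.sum(), dilate(grid, 1).sum(), dilate(grid, 2).sum())  # 1 9 25
```

Each pass grows the occupied region by one cell in every direction, so stacking several such layers lets sparse radar returns cover regions that are already occupied in the denser LiDAR feature map.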