Enhanced 3D Object Detection via Diverse Feature Representations of 4D Radar Tensor

Seung-Hyun Song; Dong-Hee Paek; Minh-Quan Dao; Ezio Malis; Seung-Hyun Kong

TL;DR

This work tackles robust 3D object detection with 4D Radar by addressing the variability introduced by diverse radar pre-processing. It introduces 4D Radar Multi-Representation (4DR-MR), a multi-teacher knowledge distillation framework in which each teacher learns from a different 4DRT pre-processing pipeline, and a fusion-then-distillation mechanism transfers the teachers' representations to a lightweight student that operates on sparse radar inputs. The key contributions are an aggregation module, comprising a dedicated representation-alignment stage and an attention-based fusion stage; a densify module that bridges the density gap between teacher and student features; and a balanced loss for detection and distillation. On the K-Radar dataset, 4DR-MR improves on the RTNH baseline by 7.3% in AP_3D and 9.5% in AP_BEV with extremely sparse inputs and remains competitive with denser-input methods, while reducing input data size by a factor of about 90 and preserving runtime efficiency. These results demonstrate the practical viability of leveraging diverse 4DRT representations to improve radar-based perception in resource-constrained autonomous systems.

Abstract

Recent advances in automotive four-dimensional (4D) Radar have enabled access to the raw 4D Radar Tensor (4DRT), which offers richer spatial and Doppler information than conventional point clouds. While most existing methods rely on heavily pre-processed, sparse Radar data, recent attempts to leverage raw 4DRT face high computational costs and limited scalability. To address these limitations, we propose a novel three-dimensional (3D) object detection framework that maximizes the utility of 4DRT while preserving efficiency. Our method introduces a multi-teacher knowledge distillation (KD) scheme, in which multiple teacher models are trained on point clouds derived from diverse 4DRT pre-processing techniques, each capturing complementary signal characteristics. These teacher representations are fused via a dedicated aggregation module and distilled into a lightweight student model that operates solely on a sparse Radar input. Experimental results on the K-Radar dataset demonstrate that our framework achieves improvements of 7.3% in AP_3D and 9.5% in AP_BEV over the baseline RTNH model when using extremely sparse inputs. Furthermore, it attains performance comparable to denser-input baselines while reducing the input data size by a factor of about 90, confirming the scalability and efficiency of our approach.
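The fusion-then-distillation idea can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The function names (`fuse_teachers`, `distill_loss`, `total_loss`), the use of the teacher mean as the attention query, the mean-squared-error distillation term, and the `kd_weight` parameter are all assumptions made for illustration. The sketch operates on a single feature vector per teacher, assumes the teacher features have already been aligned to a common shape, and leaves out the densify module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_teachers(teacher_feats):
    """Attention-style fusion of K aligned teacher feature vectors.

    Hypothetical stand-in for an attention-based fusion stage: the query is
    the mean of the teacher features (an assumption; a real module would
    learn its query/key projections).
    """
    k = len(teacher_feats)
    dim = len(teacher_feats[0])
    query = [sum(f[d] for f in teacher_feats) / k for d in range(dim)]
    scale = math.sqrt(dim)
    # Scaled dot-product score of each teacher against the query.
    scores = [sum(q * x for q, x in zip(query, f)) / scale for f in teacher_feats]
    weights = softmax(scores)
    # Weighted sum of teacher features, one output value per dimension.
    fused = [sum(w * f[d] for w, f in zip(weights, teacher_feats)) for d in range(dim)]
    return fused, weights


def distill_loss(student_feat, fused_feat):
    """Mean squared error between student and fused teacher features (assumed form)."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, fused_feat)) / len(fused_feat)


def total_loss(det_loss, kd_loss, kd_weight=1.0):
    """Detection loss plus weighted distillation loss; kd_weight is hypothetical."""
    return det_loss + kd_weight * kd_loss


# Three illustrative teacher feature vectors, e.g. one per pre-processing pipeline.
teachers = [[0.2, 0.9, 0.1], [0.3, 0.7, 0.2], [0.1, 1.0, 0.0]]
student = [0.0, 0.5, 0.5]

fused, weights = fuse_teachers(teachers)
kd = distill_loss(student, fused)
loss = total_loss(det_loss=1.0, kd_loss=kd, kd_weight=0.5)
```

In this sketch the student is pulled toward a single fused target rather than toward each teacher separately, which is the sense of "fusion-then-distillation" used above. How the detection and distillation terms are actually balanced is not specified in this section, so `kd_weight` is only a placeholder.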
