RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection

Yunfei Long; Abhinav Kumar; Xiaoming Liu; Daniel Morris

TL;DR

This work addresses a core challenge in camera-radar 3D detection, fusing depth and pose information from the two sensors, by explicitly modeling radar hit distributions conditioned on object properties. The authors introduce RICCARDO, a three-stage pipeline: Stage 1 predicts an object-centered radar hit distribution (RIC) in bird's-eye view (BEV); Stage 2 convolves this distribution with accumulated radar points to yield radial matching scores; Stage 3 refines candidates by integrating monocular cues and Stage-2 evidence to produce a final range estimate and score. The approach achieves state-of-the-art radar-camera fusion performance on nuScenes, with improved range estimation and robust fusion across categories, and ablations validate the value of learned radar distributions over baselines. The method is lightweight and modular and can be paired with different monocular detectors, which suggests practical value for improving depth and pose estimation in autonomous driving with low-cost sensors.

Abstract

Radar hits reflect from points both on the boundary of and internal to object outlines. This results in a complex distribution of radar hits that depends on factors including object category, size, and orientation. Current radar-camera fusion methods implicitly account for this with a black-box neural network. In this paper, we explicitly utilize a radar hit distribution model to assist fusion. First, we build a model to predict radar hit distributions conditioned on object properties obtained from a monocular detector. Second, we use the predicted distribution as a kernel to match actual measured radar points in the neighborhood of the monocular detections, generating matching scores at nearby positions. Finally, a fusion stage combines context with the kernel detector to refine the matching scores. Our method achieves state-of-the-art radar-camera detection performance on nuScenes. Our source code is available at https://github.com/longyunf/riccardo.
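
The kernel-matching step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `radial_matching_scores`, the grid layout, and all parameters are hypothetical, and the kernel is assumed to be axis-aligned with the sensor's BEV frame, whereas the real pipeline presumably handles object orientation and learned normalization. The sketch slides an object-centered hit-density kernel along the viewing ray through the monocular detection and sums the kernel values at the radar points for each candidate range.

```python
import numpy as np


def radial_matching_scores(kernel, cell_size, radar_xy, mono_center, range_offsets):
    """Score candidate object centers along the viewing ray (illustrative only).

    kernel        : (H, W) hypothetical predicted hit density on a BEV grid
                    centered on the object (rows = x offset, cols = y offset).
    cell_size     : grid resolution in meters.
    radar_xy      : (N, 2) radar points in the sensor's BEV frame.
    mono_center   : (2,) monocular estimate of the object center in BEV.
    range_offsets : (K,) candidate range shifts in meters along the ray.
    Returns a (K,) array with one matching score per candidate range.
    """
    kernel = np.asarray(kernel, dtype=float)
    radar_xy = np.asarray(radar_xy, dtype=float).reshape(-1, 2)
    mono_center = np.asarray(mono_center, dtype=float)
    range_offsets = np.asarray(range_offsets, dtype=float)

    # Candidate centers lie on the ray from the sensor origin through the
    # monocular center, shifted by each range offset.
    mono_range = np.linalg.norm(mono_center)
    ray = mono_center / mono_range
    centers = (mono_range + range_offsets)[:, None] * ray[None, :]  # (K, 2)

    # Radar points expressed relative to every candidate center.
    rel = radar_xy[None, :, :] - centers[:, None, :]  # (K, N, 2)

    # Map metric offsets to kernel cells; the grid is centered on the object.
    h, w = kernel.shape
    rows = np.floor(rel[..., 0] / cell_size + h / 2).astype(int)
    cols = np.floor(rel[..., 1] / cell_size + w / 2).astype(int)
    inside = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)

    # Points falling outside the kernel support contribute zero.
    values = np.where(
        inside,
        kernel[np.clip(rows, 0, h - 1), np.clip(cols, 0, w - 1)],
        0.0,
    )
    return values.sum(axis=1)


# Toy usage: a kernel with all of its mass in the center cell, one radar hit
# located 2 m beyond the monocular range estimate.
toy_kernel = np.zeros((5, 5))
toy_kernel[2, 2] = 1.0
offsets = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
scores = radial_matching_scores(toy_kernel, 1.0, [[12.0, 0.0]], [10.0, 0.0], offsets)
print(offsets[np.argmax(scores)])
```

In this toy setting the highest score falls at the offset that moves the candidate center onto the radar hit, which is the kind of range evidence a later fusion stage could combine with monocular cues.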
