Boosting Online 3D Multi-Object Tracking through Camera-Radar Cross Check
Sheng-Yao Kuan, Jen-Hao Cheng, Hsiang-Wei Huang, Wenhao Chai, Cheng-Yen Yang, Hugo Latapie, Gaowen Liu, Bing-Fei Wu, Jenq-Neng Hwang
TL;DR
CRAFTBooster addresses the challenge of surpassing single-modality detectors in 3D multi-object tracking (MOT) with an online cross-modality framework that fuses camera and radar at the tracking stage. The system comprises three modules (Inner-modality Matching, Cross-modality Check, and Multi-modality Fusion) and exploits perspective-view camera detections and BEV radar detections to recover missed tracklets and fuse observations. Empirically, it yields IDF1 gains of about 5-6% on K-Radar and 1-2% on CRUW3D, demonstrating robustness across diverse weather conditions and compatibility with existing online trackers. The work highlights the practical potential of dedicated tracking-stage fusion for reliable 3D MOT in autonomous driving.
Abstract
In the domain of autonomous driving, the integration of multi-modal perception techniques based on data from diverse sensors has demonstrated substantial progress. However, effectively surpassing the capabilities of state-of-the-art single-modality detectors through sensor fusion remains an active challenge. This work leverages the respective advantages of cameras in perspective view and radars in Bird's Eye View (BEV) to greatly enhance overall detection and tracking performance. Our approach, Camera-Radar Associated Fusion Tracking Booster (CRAFTBooster), represents a pioneering effort to enhance radar-camera fusion in the tracking stage, contributing to improved 3D MOT accuracy. The experimental results on the K-Radar dataset, which show a 5-6% gain in IDF1 tracking performance, validate the potential of effective sensor fusion in advancing autonomous driving.
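
For readers unfamiliar with the IDF1 metric cited above, the following minimal sketch shows how it is commonly computed from identity-level counts: IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function name `idf1` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper or its evaluation code.

```python
def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score from identity-level counts.

    idtp: detections assigned to the correct identity (ID true positives)
    idfp: predicted detections with no correct identity match (ID false positives)
    idfn: ground-truth detections with no correct identity match (ID false negatives)

    Assumption: the counts come from a global identity matching between
    predicted and ground-truth trajectories, computed elsewhere.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        # No ground truth and no predictions: define the score as 0.0 here.
        return 0.0
    return 2 * idtp / denominator


# Illustrative counts only (not results from the paper).
baseline = idf1(idtp=80, idfp=20, idfn=20)
boosted = idf1(idtp=85, idfp=15, idfn=15)
print(f"baseline IDF1 = {baseline:.3f}, boosted IDF1 = {boosted:.3f}")
```

Because IDF1 rewards keeping the same identity over a whole trajectory, recovering missed tracklets raises IDTP while lowering IDFN, which is the kind of improvement a tracking-stage fusion method would be expected to show on this metric.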
