Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection

Reza Sadeghian; Niloofar Hooshyaripour; Chris Joslin; WonSook Lee

TL;DR

This paper addresses robust 3D object detection for autonomous driving under sensor malfunctions. It proposes ReliFusion, a BEV-based LiDAR-camera fusion framework with three components: Spatio-Temporal Feature Aggregation (STFA), a Reliability Module driven by Cross-Modality Contrastive Learning, and Confidence-Weighted Mutual Cross-Attention (CW-MCA). STFA captures both inter-view spatial dependencies and cross-time dynamics, the Reliability Module produces per-modality confidence scores that quantify data reliability, and CW-MCA uses these scores to dynamically balance LiDAR and camera information during fusion. On the nuScenes dataset, ReliFusion outperforms BEVFusion and TransFusion in robustness and accuracy, particularly when LiDAR has a limited field of view or is degraded. The approach maintains accurate BEV detections under adverse sensing conditions and motivates further research on reliability-aware multimodal fusion.

Abstract

Accurate and robust 3D object detection is essential for autonomous driving, where fusing data from sensors such as LiDAR and cameras enhances detection accuracy. However, sensor malfunctions such as corruption or disconnection can degrade performance, and existing fusion models often struggle to maintain reliability when one modality fails. To address this, we propose ReliFusion, a novel LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space. ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, which captures dependencies across frames to stabilize predictions over time; the Reliability module, which assigns confidence scores to quantify the dependability of each modality under challenging conditions; and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module, which dynamically balances information from the LiDAR and camera modalities based on these confidence scores. Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
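
The idea of balancing the two modalities by confidence can be illustrated with a small sketch. The abstract gives no equations, so everything below is an assumption made for illustration and not the authors' formulation: the function names (`cross_attend`, `confidence_weighted_fusion`), the single-head attention without learned projections, the normalization of the two confidence scores so they sum to one, and the residual mixing rule are all hypothetical. BEV features are represented as plain lists of per-cell vectors, with LiDAR and camera features assumed to be aligned cell by cell.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Single-head scaled dot-product attention without learned projections.

    Each query vector attends over `context`, which serves as both keys
    and values, so the output lies in the span of the context features.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return attended


def confidence_weighted_fusion(lidar, camera, conf_lidar, conf_camera):
    """Fuse aligned BEV features, scaling each modality by its confidence.

    Assumption (not from the paper): confidences are non-negative and are
    normalized to sum to one before being used as mixing weights.
    """
    total = conf_lidar + conf_camera
    if total <= 0:
        raise ValueError("at least one modality needs positive confidence")
    w_l, w_c = conf_lidar / total, conf_camera / total

    # Mutual cross-attention: each modality queries the other one.
    cam_for_lidar = cross_attend(lidar, camera)   # camera content per LiDAR cell
    lidar_for_cam = cross_attend(camera, lidar)   # LiDAR content per camera cell

    fused = []
    for l, c, cl, lc in zip(lidar, camera, cam_for_lidar, lidar_for_cam):
        # Borrowed information is scaled by the confidence of its source,
        # and each enhanced branch by the confidence of its own modality.
        fused.append([
            w_l * (l[j] + w_c * cl[j]) + w_c * (c[j] + w_l * lc[j])
            for j in range(len(l))
        ])
    return fused


# Toy example: two BEV cells with two channels each.
lidar_bev = [[1.0, 0.0], [0.0, 1.0]]
camera_bev = [[0.5, 0.5], [2.0, -1.0]]
fused_bev = confidence_weighted_fusion(lidar_bev, camera_bev, conf_lidar=0.2, conf_camera=0.8)
```

Under this mixing rule, a confidence of zero for one modality removes its contribution entirely, so the fused output falls back to the other modality's features. That mirrors the behaviour the abstract describes for a failed sensor, though the actual module may achieve it differently.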
