LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty Guidance

Huawei Sun, Nastassia Vysotskaya, Tobias Sukianto, Hao Feng, Julius Ott, Xiangyuan Peng, Lorenzo Servadei, Robert Wille

TL;DR

LiRCDepth is a lightweight radar-camera depth estimation model trained by distilling knowledge from a heavy teacher (CaFNet) into a MobileNetV2-based student. It introduces three distillation streams (single-modal feature distillation, structure-guided decoding distillation, and uncertainty-aware inter-depth distillation) plus an uncertainty-rectified depth loss that leverages both dense and single-scan LiDAR supervision. The model uses ~80% fewer parameters while keeping accuracy competitive: distillation yields a $6.6\%$ improvement in MAE on nuScenes over non-distilled training, and FLOPs are reduced to about $121$G. This yields a practically deployable fusion model for autonomous driving that maintains depth fidelity under adverse conditions and limited compute.

Abstract

Recently, radar-camera fusion algorithms have gained significant attention as radar sensors provide geometric information that complements the limitations of cameras. However, most existing radar-camera depth estimation algorithms focus solely on improving performance, often neglecting computational efficiency. To address this gap, we propose LiRCDepth, a lightweight radar-camera depth estimation model. We incorporate knowledge distillation to enhance the training process, transferring critical information from a complex teacher model to our lightweight student model in three key domains. Firstly, low-level and high-level features are transferred by incorporating pixel-wise and pair-wise distillation. Additionally, we introduce an uncertainty-aware inter-depth distillation loss to refine intermediate depth maps during decoding. Leveraging our proposed knowledge distillation scheme, the lightweight model achieves a 6.6% improvement in MAE on the nuScenes dataset compared to the model trained without distillation. Code: https://github.com/harborsarah/LiRCDepth
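The pixel-wise and pair-wise distillation terms named in the abstract can be illustrated with a minimal NumPy sketch. This is a generic formulation of the two ideas, not necessarily the losses used in the paper: the function names, the squared-error penalty, and the cosine-similarity affinity are all assumptions made for illustration. The linked repository is the place to check the actual implementation.

```python
import numpy as np


def pixel_wise_loss(student_feat, teacher_feat):
    """Mean squared error between two feature maps of identical shape (C, H, W).

    Assumption: the student feature has already been projected to the
    teacher's channel count, so the shapes match.
    """
    return float(np.mean((student_feat - teacher_feat) ** 2))


def pairwise_similarity(feat):
    """Cosine similarity between every pair of spatial locations.

    feat has shape (C, H, W); the result has shape (H*W, H*W).
    """
    channels = feat.shape[0]
    flat = feat.reshape(channels, -1).T  # (H*W, C): one row per pixel
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-8)  # guard against all-zero pixels
    return flat @ flat.T


def pair_wise_loss(student_feat, teacher_feat):
    """Mean squared difference between student and teacher affinity maps.

    Only the spatial sizes must agree; the channel counts may differ.
    """
    diff = pairwise_similarity(student_feat) - pairwise_similarity(teacher_feat)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(8, 4, 4))   # hypothetical teacher feature map
    student = rng.normal(size=(8, 4, 4))   # hypothetical student feature map
    print("pixel-wise:", pixel_wise_loss(student, teacher))
    print("pair-wise: ", pair_wise_loss(student, teacher))
```

In this sketch the pixel-wise term matches feature values location by location, while the pair-wise term matches how locations relate to each other, which is why it does not depend on the channel count.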

Paper Structure

This paper contains 14 sections, 7 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Model Architecture.
  • Figure 2: Qualitative comparison at 80 meters depth range. Column 2: LiRCDepth (w/o KD). Column 3: LiRCDepth (KD).