DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions

Jingyu Zhang, Yilei Wang, Lang Qian, Peng Sun, Zengwen Li, Sudong Jiang, Maolin Liu, Liang Song

TL;DR

This work addresses the robustness gap of multi-agent collaborative perception under real-world corruptions by introducing DSRC, a density-insensitive and semantic-aware distillation framework. DSRC uses a teacher-student architecture in which a teacher fed dense, multi-view objects painted by ground-truth bounding boxes guides a student of identical network structure through three-stage distillation (after encoding, after fusion, and after prediction), while an auxiliary voxel-level point cloud reconstruction module supervises cross-agent fusion. Only the student is retained at inference. On OPV2V and DAIR-V2X, DSRC outperforms state-of-the-art methods in both clean and corrupted conditions, and ablations confirm the contributions of sparse-to-dense distillation, semantic painting, and reconstruction. Code is publicly available.

Abstract

As a potential application of Vehicle-to-Everything (V2X) communication, multi-agent collaborative perception has achieved significant success in 3D object detection. While these methods have demonstrated impressive results on standard benchmarks, the robustness of such approaches in the face of complex real-world environments requires additional verification. To bridge this gap, we introduce the first comprehensive benchmark designed to evaluate the robustness of collaborative perception methods in the presence of natural corruptions typical of real-world environments. Furthermore, we propose DSRC, a robustness-enhanced collaborative perception method aiming to learn Density-insensitive and Semantic-aware collaborative Representation against Corruptions. DSRC consists of two key designs: i) a semantic-guided sparse-to-dense distillation framework, which constructs multi-view dense objects painted by ground truth bounding boxes to effectively learn density-insensitive and semantic-aware collaborative representation; ii) a feature-to-point cloud reconstruction approach to better fuse critical collaborative representation across agents. To thoroughly evaluate DSRC, we conduct extensive experiments on real-world and simulated datasets. The results demonstrate that our method outperforms SOTA collaborative perception methods in both clean and corrupted conditions. Code is available at https://github.com/Terry9a/DSRC.

Paper Structure

This paper contains 25 sections, 11 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Visualization of typical corruption types in our benchmark. Point clouds from different agents are shown in different colors.
  • Figure 2: The overall architecture of the proposed framework DSRC. It contains two branches with identical network structures: Student (Bottom) and Teacher (Top). The framework employs a three-stage distillation strategy, consisting of distillation after encoding (DAE), distillation after fusion (DAF), and distillation after prediction (DAP), to achieve effective knowledge transfer, together with a point cloud reconstruction module that helps fuse crucial collaborative representations across agents. During inference, the teacher model and point cloud reconstruction are discarded; only the student model (blue data flow) is retained.
  • Figure 3: Illustration of the proposed point cloud reconstruction module. It provides additional supervision to better fuse critical collaborative representation across agents.
  • Figure 4: The average performance of all models under different corruption types.
  • Figure 5: Benchmarking results of all models on the six robustness sets. The figure shows the mean corruption error (mCE) vs. the mean average precision (mAP).
  • ...and 3 more figures
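
The three-stage distillation named in the Figure 2 caption (DAE, DAF, DAP) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the DSRC code: the function names `mse` and `distillation_loss`, the stage keys, the use of mean squared error at every stage, and the per-stage weights are all hypothetical, and features are flattened to plain lists of floats.

```python
from typing import Dict, Optional, Sequence

Vector = Sequence[float]

# Hypothetical stage keys standing in for DAE, DAF, and DAP.
STAGES = ("encoded", "fused", "predicted")


def mse(student: Vector, teacher: Vector) -> float:
    """Mean squared error between two equally sized feature vectors."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher features must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def distillation_loss(
    student: Dict[str, Vector],
    teacher: Dict[str, Vector],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted sum of per-stage student/teacher mismatches.

    Assumption: each stage uses MSE; the paper's actual loss terms may differ.
    """
    if weights is None:
        weights = {stage: 1.0 for stage in STAGES}
    return sum(weights[stage] * mse(student[stage], teacher[stage]) for stage in STAGES)


# Toy features: the student matches the teacher only after fusion.
teacher_feats = {"encoded": [1.0, 2.0], "fused": [0.5, 0.5], "predicted": [1.0, 0.0]}
student_feats = {"encoded": [1.0, 1.0], "fused": [0.5, 0.5], "predicted": [0.0, 0.0]}

total = distillation_loss(student_feats, teacher_feats)
```

In this sketch the teacher outputs act only as fixed targets, which mirrors the caption's statement that the teacher branch is discarded at inference and only the student is retained.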