RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment

Zhe Huang, Shuo Wang, Yongcai Wang, Wanting Li, Deying Li, Lei Wang

TL;DR

To the authors' knowledge, this work is the first to model pose correction in collaborative perception as an object matching task. The matching reliably associates common objects detected by different agents and remains robust when the agents' pose information is highly noisy.

Abstract

Collaborative autonomous driving with multiple vehicles usually requires fusing data from multiple modalities. For the fusion to be effective, the data from each individual modality must be of reasonably high quality. In collaborative perception, however, the quality of object detection based on a single modality is highly sensitive to relative pose errors among the agents. Such errors lead to feature misalignment and significantly reduce collaborative performance. To address this issue, we propose RoCo, a novel unsupervised framework that performs iterative object matching and agent pose adjustment. To the best of our knowledge, our work is the first to model the pose correction problem in collaborative perception as an object matching task, which reliably associates common objects detected by different agents. On top of this, we propose a graph optimization process that adjusts the agent poses by minimizing the alignment errors of the associated objects; the object matching is then redone based on the adjusted agent poses. This process is repeated until convergence. Experiments on both simulated and real-world datasets demonstrate that RoCo consistently outperforms existing relevant methods in collaborative object detection performance and remains robust when the agents' pose information is highly noisy. Ablation studies show the impact of its key parameters and components. The code is released at https://github.com/HuangZhe885/RoCo.
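The alternation described in the abstract (match objects, adjust the pose, match again) can be illustrated with a minimal 2D sketch. This is not the authors' implementation: it uses only object centers, replaces the paper's graph optimization with a closed-form least-squares rigid alignment, and uses a simple greedy distance-based matcher. All function names and the `max_dist` threshold are hypothetical choices made for illustration.

```python
import math

def transform(pose, pts):
    """Map 2D points from the other agent's frame into the ego frame."""
    x, y, th = pose
    c, s = math.cos(th), math.sin(th)
    return [(c * px - s * py + x, s * px + c * py + y) for px, py in pts]

def match_objects(ego_pts, other_in_ego, max_dist):
    """Greedy one-to-one matching of object centers, closest pairs first."""
    cands = sorted(
        (math.dist(e, o), i, j)
        for i, e in enumerate(ego_pts)
        for j, o in enumerate(other_in_ego)
    )
    used_i, used_j, pairs = set(), set(), []
    for d, i, j in cands:
        if d > max_dist:
            break  # remaining candidates are even farther apart
        if i not in used_i and j not in used_j:
            used_i.add(i)
            used_j.add(j)
            pairs.append((i, j))
    return pairs

def fit_pose(ego_pts, other_pts, pairs):
    """Closed-form least-squares rigid alignment of the matched centers."""
    n = len(pairs)
    qcx = sum(ego_pts[i][0] for i, _ in pairs) / n
    qcy = sum(ego_pts[i][1] for i, _ in pairs) / n
    pcx = sum(other_pts[j][0] for _, j in pairs) / n
    pcy = sum(other_pts[j][1] for _, j in pairs) / n
    dot = cross = 0.0
    for i, j in pairs:
        px, py = other_pts[j][0] - pcx, other_pts[j][1] - pcy
        qx, qy = ego_pts[i][0] - qcx, ego_pts[i][1] - qcy
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    th = math.atan2(cross, dot)
    c, s = math.cos(th), math.sin(th)
    return (qcx - (c * pcx - s * pcy), qcy - (s * pcx + c * pcy), th)

def refine_pose(ego_pts, other_pts, pose, max_dist=3.0, iters=20, tol=1e-9):
    """Alternate matching and pose fitting until the pose stops changing."""
    for _ in range(iters):
        pairs = match_objects(ego_pts, transform(pose, other_pts), max_dist)
        if len(pairs) < 2:
            break  # too few associations to constrain a 2D rigid pose
        new_pose = fit_pose(ego_pts, other_pts, pairs)
        delta = max(abs(a - b) for a, b in zip(new_pose, pose))
        pose = new_pose
        if delta < tol:
            break
    return pose
```

Calling `refine_pose(ego_pts, other_pts, noisy_pose)` returns an `(x, y, theta)` estimate of the other agent's pose in the ego frame. A greedy matcher like this one depends on the initial pose error being small relative to the spacing between objects, which is the regime where a more robust matching scheme such as the one the paper proposes becomes important.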

Paper Structure

This paper contains 20 sections, 10 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Illustration of a robust collaborative perception system and the results with and without the proposed RoCo.
  • Figure 2: Overview of the RoCo system. Object bounding boxes and poses are transmitted as messages to other agents, which perform object matching and robust graph optimization to obtain corrected matches and poses. Features are then transformed into the ego coordinate system using the corrected poses and fused across all agents.
  • Figure 3: Illustration of object matching and the pose graph.
  • Figure 4: How the residue errors are set up.
  • Figure 5: Visualization of detection results for V2VNet, V2X-ViT, CoAlign and our RoCo at noise levels $\sigma^2_{t}/\sigma^2_{r}\ \left(m/^{\circ}\right)$ of 0.4/0.4 (first row) and 0.8/0.8 (second row) on the V2XSet dataset. An intersection scenario is shown. RoCo qualitatively outperforms the other methods at both noise levels.
  • ...and 2 more figures