Adaptive Communications in Collaborative Perception with Domain Alignment for Autonomous Driving

Senkang Hu, Zhengru Fang, Haonan An, Guowen Xu, Yuan Zhou, Xianhao Chen, Yuguang Fang

TL;DR

ACC-DA is a channel-aware collaborative perception framework that dynamically adjusts the communication graph to minimize the average transmission delay while mitigating the impact of data heterogeneity.

Abstract

Collaborative perception among multiple connected and autonomous vehicles (CAVs) can greatly enhance perception capabilities by allowing vehicles to exchange supplementary information via communications. Despite advances in previous approaches, challenges remain due to channel variations and data heterogeneity among collaborating vehicles. To address these issues, we propose ACC-DA, a channel-aware collaborative perception framework that dynamically adjusts the communication graph and minimizes the average transmission delay while mitigating the side effects of data heterogeneity. Our contributions are threefold. First, we design a transmission delay minimization method that constructs the communication graph and minimizes the transmission delay according to the channel state information. Second, we propose an adaptive data reconstruction mechanism that dynamically adjusts the rate-distortion trade-off to enhance perception efficiency and minimizes temporal redundancy during data transmission. Finally, we devise a domain alignment scheme that aligns the data distributions of different vehicles, mitigating the domain gap between them and improving performance on the target task. Comprehensive experiments demonstrate the effectiveness of our method in comparison with existing state-of-the-art works.
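
The abstract does not spell out the delay model, so the following is only a minimal sketch of the idea of choosing a communication graph from channel state. It assumes, purely for illustration, that each link's rate follows the Shannon capacity formula, that every sender transmits the same payload, and that links are picked greedily by lowest delay. The function names, the greedy rule, and the `max_links` budget are hypothetical and are not taken from the paper.

```python
import math


def link_rate_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon capacity of one link in bits per second (assumed rate model)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def link_delay_s(payload_bits: float, bandwidth_hz: float, snr_linear: float) -> float:
    """Transmission delay of one link: payload size divided by link rate."""
    return payload_bits / link_rate_bps(bandwidth_hz, snr_linear)


def select_links(snr_by_sender: dict, payload_bits: float,
                 bandwidth_hz: float, max_links: int):
    """Greedily keep the `max_links` senders with the smallest delay.

    Returns the chosen sender names (fastest first) and their average delay.
    This greedy rule is an illustrative stand-in, not the paper's optimizer.
    """
    if max_links < 1 or not snr_by_sender:
        raise ValueError("need at least one sender and max_links >= 1")
    delays = {
        sender: link_delay_s(payload_bits, bandwidth_hz, snr)
        for sender, snr in snr_by_sender.items()
    }
    chosen = sorted(delays, key=delays.get)[:max_links]
    average_delay = sum(delays[s] for s in chosen) / len(chosen)
    return chosen, average_delay


# Hypothetical channel state: linear SNR of each neighbor's link to the ego CAV.
snr = {"cav_a": 1.0, "cav_b": 3.0, "cav_c": 15.0}
links, avg_delay = select_links(snr, payload_bits=1e6, bandwidth_hz=1e6, max_links=2)
```

Under this toy model, better channels yield shorter delays, so the selected graph changes whenever the channel state does. That is the behavior "channel-aware" refers to here, though the paper's actual formulation may differ.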

Paper Structure

This paper contains 13 sections, 11 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overall architecture of the proposed ACC-DA framework. First, we minimize the average transmission delay and construct the communication graph. Second, CAVs transmit a small portion of raw images to the roadside unit to refine the data reconstruction and update the parameters of the encoder and decoder, reducing the temporal redundancy in the data. Meanwhile, CAVs use their encoders to convert images into a bit stream, which is then transmitted to the ego CAV. Third, the ego CAV decodes the received bit stream and aligns the reconstructed images to the domain of its own perceived image; the aligned data are then fused via a fusion network adopted from CoBEVT (Xu et al., 2022) to obtain the bird's eye view (BEV) prediction.
  • Figure 2: Domain Alignment (DA) Mechanism.
  • Figure 3: Visualization of BEV segmentation results on the OPV2V dataset: (a) ground truth, (b) No Fusion, (c) V2VNet, (d) Attention Fusion. Compared with the other methods, ACC-DA performs robustly across different traffic situations and produces more accurate results.
  • Figure 4: Effect of the network optimization. "w/" means with network optimization; "w/o" means without network optimization.
  • Figure 5: Effect of the model refinement. "w/" means with refinement; "w/o" means without refinement.
  • ...and 1 more figure