Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving

Shunli Ren, Zixing Lei, Zi Wang, Mehrdad Dianati, Yafei Wang, Siheng Chen, Wenjun Zhang

TL;DR

This work tackles the practical problem of V2X communication interruptions in cooperative perception for autonomous driving. It introduces V2X-INCOP, a system that recovers missing cooperative information from cooperation history with a communication-adaptive multi-scale spatial-temporal predictor. The predictor is trained with knowledge distillation from a teacher model that sees no communication interruptions, and a curriculum learning strategy stabilizes training. Historical features are fused with a spatial attentive feature fusion (SAFF) model, and the predictor estimates the current state of the missing features; the recovered feature is then treated as coming from a pseudo cooperation node so that feature fusion can still be completed. Experiments on V2X-Sim, OPV2V, and DAIR-V2X show consistent improvements in average precision over state-of-the-art methods across a range of packet drop rates, as well as resilience to pose noise.

Abstract

Cooperative perception can significantly improve the perception performance of autonomous vehicles beyond the limited perception ability of individual vehicles by exchanging information with neighbor agents through V2X communication. However, most existing works assume ideal communication among agents, ignoring the significant and common interruption issues caused by imperfect V2X communication, where cooperation agents cannot receive cooperative messages successfully and thus fail to achieve cooperative perception, leading to safety risks. To fully reap the benefits of cooperative perception in practice, we propose V2X communication INterruption-aware COoperative Perception (V2X-INCOP), a cooperative perception system robust to communication interruption for V2X communication-aided autonomous driving. It leverages historical cooperation information to recover the information lost to interruptions and thereby alleviates their impact. To achieve comprehensive recovery, we design a communication-adaptive multi-scale spatial-temporal prediction model that extracts multi-scale spatial-temporal features based on V2X communication conditions and captures the most significant information for predicting the missing information. To further improve recovery performance, we adopt a knowledge distillation framework to give explicit and direct supervision to the prediction model and a curriculum learning strategy to stabilize the training of the model. Experiments on three public cooperative perception datasets demonstrate that the proposed method is effective in alleviating the impact of communication interruption on cooperative perception.
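
To make the interruption setting described above concrete, the following minimal Python sketch simulates random packet drops and falls back on cooperation history when a message is missing. It is not the authors' implementation: `simulate_reception` and `HistoryBuffer` are hypothetical names, the string features are toy placeholders, and reusing the most recently received feature is only a stand-in for the learned communication-adaptive multi-scale spatial-temporal predictor.

```python
import random
from collections import deque


def simulate_reception(num_neighbors, drop_rate, rng):
    """Draw one Bernoulli trial per neighbor: True means the message arrived."""
    return [rng.random() >= drop_rate for _ in range(num_neighbors)]


class HistoryBuffer:
    """Stores the last `k` successfully received features per neighbor
    (hypothetical helper, assumed for illustration)."""

    def __init__(self, num_neighbors, k):
        self.buffers = [deque(maxlen=k) for _ in range(num_neighbors)]

    def step(self, incoming, received):
        """Return one feature per neighbor, falling back to history on a drop."""
        usable = []
        for buf, feat, ok in zip(self.buffers, incoming, received):
            if ok:
                buf.append(feat)        # fresh message: record it in the history
                usable.append(feat)
            elif buf:
                usable.append(buf[-1])  # stand-in for the learned predictor
            else:
                usable.append(None)     # nothing ever received from this neighbor
        return usable


# Toy run: two neighbors, 30% packet drop rate, history length 3.
rng = random.Random(0)
history = HistoryBuffer(num_neighbors=2, k=3)
for t in range(5):
    incoming = [f"feat_n{n}_t{t}" for n in range(2)]
    received = simulate_reception(2, drop_rate=0.3, rng=rng)
    print(t, received, history.step(incoming, received))
```

In a real system the fallback branch would call a trained predictor on the buffered history rather than copy the last entry; the buffer only shows where historical information enters the pipeline.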

Paper Structure

This paper contains 30 sections, 17 equations, 13 figures, and 1 table.

Figures (13)

  • Figure 1: (a) The communication interruption issue. With successful communication, the supportive message from the yellow vehicle expands the perception range of the green vehicle, addressing occlusion and long-range challenges. When communication between the two vehicles is interrupted, the green vehicle fails to receive the cooperative message, causing missed detections and safety risks. (b) Cooperative perception performance at different packet drop rates on the V2X-Sim 1.0 dataset [li2021learning]. Performance degrades significantly under communication interruptions (blue curve); the proposed interruption-aware cooperative perception system (red curve) mitigates this degradation.
  • Figure 2: Illustration of cooperative perception with and without communication interruption. (a) With ideal communication, each cooperation node successfully receives messages from all its neighbours at each timestep. (b) With communication interruption, interruptions occur randomly, so each cooperation node receives messages from only some of its neighbours. Communication between each agent pair is interrupted at each timestep with a certain interruption probability.
  • Figure 3: Overview of the V2X-INCOP system. The system adopts an intermediate cooperation paradigm and fuses features with a spatial attentive feature fusion model. When a communication interruption happens, V2X-INCOP recovers the missing information from the cooperation history through the missing information recovery process, in which a multi-scale spatial-temporal prediction model predicts the current state of the features from the communication-adaptive historical information of past timesteps. The recovered feature is then treated as the feature from a pseudo cooperation node, compensating for the information lost to the interruption so that the current feature fusion can be completed.
  • Figure 4: Spatial attentive feature fusion (SAFF) model. The SAFF model first computes a spatial attention weight for each feature and then fuses the features according to these weights. This captures the regions of each cooperation feature that are informative for the ego node and yields a fused feature adaptive to the communication condition.
  • Figure 5: Multi-scale spatial-temporal prediction model. (a) The model first extracts multi-scale spatial-temporal features of historical information and then predicts the recovered feature. (b) The model is given explicit supervision during training with knowledge distillation from a teacher model without communication interruptions.
  • ...and 8 more figures
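
The fusion step described in the Figure 4 caption (compute a spatial attention weight for each feature, then fuse according to the weights) can be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not the paper's SAFF model: the function name `saff_fuse`, the channel-mean scoring, and the softmax over agents are all assumptions standing in for learned layers, and the `available` mask is a hypothetical way to represent interrupted links.

```python
import numpy as np


def saff_fuse(features, available):
    """Fuse per-agent feature maps with per-location attention weights.

    features:  array of shape (N, C, H, W), one feature map per agent.
    available: length-N booleans; False marks a feature lost to interruption.
    Returns (fused, weights) with shapes (C, H, W) and (N, H, W).
    """
    features = np.asarray(features, dtype=np.float64)
    available = np.asarray(available, dtype=bool)
    if not available.any():
        raise ValueError("at least one feature (e.g. the ego's) must be available")

    # Assumed scoring function: channel-wise mean. A learned layer would go here.
    scores = features.mean(axis=1)                                # (N, H, W)
    # Unavailable agents get -inf so they receive zero weight after the softmax.
    scores = np.where(available[:, None, None], scores, -np.inf)

    # Softmax over the agent axis, computed stably at every spatial location.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Weighted sum over agents, broadcasting the weights across channels.
    fused = (weights[:, None, :, :] * features).sum(axis=0)       # (C, H, W)
    return fused, weights


# Toy usage: three agents, the third one interrupted.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4, 5, 5))
fused, weights = saff_fuse(feats, [True, True, False])
print(fused.shape, weights.shape)
```

Because the weights are normalized only over the agents whose messages arrived, the fused feature adapts to whichever subset of neighbours is reachable at a given timestep; a recovered feature from a pseudo cooperation node could be passed in as one more available entry.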