Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving
Shunli Ren, Zixing Lei, Zi Wang, Mehrdad Dianati, Yafei Wang, Siheng Chen, Wenjun Zhang
TL;DR
This work tackles the practical problem of V2X communication interruptions in cooperative perception for autonomous driving. It introduces V2X-INCOP, a system that recovers missing cooperative information by leveraging history through a communication-adaptive multi-scale spatial-temporal predictor, augmented with knowledge distillation from an interruption-free oracle and curriculum learning for stable training. The approach uses a SAFF-based history fusion and a multi-scale predictor to generate current estimates of missing features, enabling pseudo-cooperation and robust late-fusion. Extensive experiments on V2X-Sim, OPV2V, and DAIR-V2X show consistent improvements over state-of-the-art methods across various packet drop rates, including significant gains in average precision and demonstrated resilience to pose noise, indicating strong practical potential for interruption-aware cooperative perception.
Abstract
Cooperative perception can significantly improve the perception performance of autonomous vehicles beyond the limited perception ability of individual vehicles by exchanging information with neighbor agents through V2X communication. However, most existing work assumes ideal communication among agents, ignoring the significant and common interruption issues caused by imperfect V2X communication, where cooperating agents cannot receive cooperative messages successfully and thus fail to achieve cooperative perception, leading to safety risks. To fully reap the benefits of cooperative perception in practice, we propose V2X communication INterruption-aware COoperative Perception (V2X-INCOP), a cooperative perception system robust to communication interruption for V2X communication-aided autonomous driving. V2X-INCOP leverages historical cooperation information to recover the information lost to interruptions and thereby alleviate their impact. To achieve comprehensive recovery, we design a communication-adaptive multi-scale spatial-temporal prediction model that extracts multi-scale spatial-temporal features based on V2X communication conditions and captures the information most significant for predicting the missing information. To further improve recovery performance, we adopt a knowledge distillation framework to give explicit and direct supervision to the prediction model and a curriculum learning strategy to stabilize the training of the model. Experiments on three public cooperative perception datasets demonstrate that the proposed method is effective in alleviating the impact of communication interruption on cooperative perception.
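The recovery idea described in the abstract can be illustrated with a toy sketch: when a cooperative message is dropped, the receiver substitutes an estimate derived from previously received features, and the simulated drop rate is ramped up over training in a curriculum-like fashion. This is not the V2X-INCOP model. The names `curriculum_drop_rate`, `extrapolate`, and `receive_with_recovery` are hypothetical, the linear ramp is an assumed schedule, and simple linear extrapolation stands in for the learned multi-scale spatial-temporal predictor.

```python
import random
from collections import deque


def curriculum_drop_rate(epoch, total_epochs, max_rate=0.5):
    """Assumed schedule: ramp the simulated drop rate linearly from 0 to max_rate."""
    if total_epochs <= 1:
        return max_rate
    return max_rate * min(epoch, total_epochs - 1) / (total_epochs - 1)


def extrapolate(history):
    """Placeholder predictor: linear extrapolation from the last two features.

    Stands in for a learned predictor; returns None when there is no history.
    """
    if not history:
        return None
    if len(history) == 1:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2 * b - a for a, b in zip(prev, last)]


def receive_with_recovery(messages, drop_rate, rng, history_len=3):
    """Simulate packet drops and fill each gap with an estimate from history.

    Returns a list of (feature, received) pairs; feature is None when a
    message is dropped before anything has been received.
    """
    history = deque(maxlen=history_len)
    out = []
    for msg in messages:
        if rng.random() < drop_rate:
            feat, received = extrapolate(history), False
        else:
            feat, received = list(msg), True
        out.append((feat, received))
        if feat is not None:
            # Estimates are kept in the history too, so prediction can
            # continue across several consecutive drops.
            history.append(feat)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [[float(t), 2.0 * t] for t in range(6)]
    for epoch in range(3):
        rate = curriculum_drop_rate(epoch, total_epochs=3)
        print(epoch, rate, receive_with_recovery(stream, rate, rng))
```

In this sketch the receiver always has some feature to fuse once at least one message has arrived, which loosely mirrors the pseudo-cooperation idea; a real system would replace `extrapolate` with a trained model supervised, for example, by an interruption-free teacher.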
