A Spatial Calibration Method for Robust Cooperative Perception
Zhiying Song, Tenghui Xie, Hailiang Zhang, Jiaxin Liu, Fuxi Wen, Jun Li
TL;DR
The paper tackles robust spatial calibration for cooperative perception under pose and perception noise. It proposes context-based matching (CBM), a lightweight, bounding-box–only inter-agent object association framework that builds intra-agent context, performs coarse matching with global consensus, and estimates the relative transform to fuse multi-view detections. CBM achieves decimeter-level relative pose accuracy and shows strong resilience to non-co-visible objects and measurement noise, outperforming prior methods on real-world (SIND) and simulated (OPV2V) datasets. The approach enables reliable V2X perception with minimal feature extraction and communication, suitable for scalable deployment in intelligent transportation systems. The results indicate significant improvements in transform accuracy (RRE, RTE) and perception quality (mAP) under varied localization errors.
Abstract
Cooperative perception is a promising technique for intelligent and connected vehicles through vehicle-to-everything (V2X) cooperation, provided that accurate pose information and relative pose transforms are available. Nevertheless, obtaining precise positioning information often entails high costs associated with navigation systems. Hence, the relative pose must be calibrated for multi-agent cooperative perception. This paper proposes a simple but effective object association approach named context-based matching (CBM), which identifies inter-agent object correspondences using intra-agent geometrical context. In detail, the method constructs contexts from the relative positions of the detected bounding boxes, then performs local context matching and global consensus maximization. The optimal relative pose transform is estimated from the matched correspondences and is then used for cooperative perception fusion. Extensive experiments are conducted on both simulated and real-world datasets. Even with large inter-agent localization errors, high object association precision and decimeter-level relative pose calibration accuracy are achieved among the cooperating agents.
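
The final step described above, estimating a relative pose transform from matched correspondences, can be illustrated with a minimal sketch. It assumes the matched objects are reduced to 2D bounding-box centers and uses the standard closed-form least-squares rigid alignment; the abstract does not specify the estimator CBM actually uses, so the function names `estimate_rigid_transform_2d` and `apply_transform` and the sample coordinates are hypothetical.

```python
import math


def apply_transform(points, theta, translation):
    """Rotate each 2D point by theta (radians), then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    tx, ty = translation
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def estimate_rigid_transform_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src[i] and dst[i] are assumed to be the centers of the same object
    as seen by two different agents (i.e. already matched).
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    n = len(src)
    # Centroids of both point sets.
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    # Accumulate dot and cross products of the centered pairs.
    dot = 0.0
    cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cx_s, ys - cy_s
        bx, by = xd - cx_d, yd - cy_d
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The angle that maximizes the alignment of the centered sets.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)


if __name__ == "__main__":
    # Hypothetical object centers in agent A's frame (meters).
    centers_a = [(0.0, 0.0), (4.0, 1.0), (-2.0, 3.0), (5.0, -6.0)]
    # The same objects expressed in agent B's frame under a known transform.
    centers_b = apply_transform(centers_a, math.radians(30.0), (1.5, -2.0))
    theta, t = estimate_rigid_transform_2d(centers_a, centers_b)
    print(math.degrees(theta), t)
```

With noise-free correspondences the known rotation and translation are recovered up to floating-point error. With noisy or partly wrong matches the result is only a least-squares fit, which is why the quality of the preceding association step matters.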
