
CMP: Cooperative Motion Prediction with Multi-Agent Communication

Zehao Wang, Yuping Wang, Zhuoyuan Wu, Hengbo Ma, Zhaowei Li, Hang Qiu, Jiachen Li

TL;DR

CMP takes LiDAR signals as model input to enhance tracking and prediction, and introduces a prediction aggregation module that unifies the predictions obtained by different CAVs into the final prediction.

Abstract

The confluence of the advancement of Autonomous Vehicles (AVs) and the maturity of Vehicle-to-Everything (V2X) communication has enabled the capability of cooperative connected and automated vehicles (CAVs). Building on top of cooperative perception, this paper explores the feasibility and effectiveness of cooperative motion prediction. Our method, CMP, takes LiDAR signals as model input to enhance tracking and prediction capabilities. Unlike previous work that focuses separately on either cooperative perception or motion prediction, our framework, to the best of our knowledge, is the first to address the unified problem where CAVs share information in both perception and prediction modules. Incorporated into our design is the unique capability to tolerate realistic V2X transmission delays, while dealing with bulky perception representations. We also propose a prediction aggregation module, which unifies the predictions obtained by different CAVs and generates the final prediction. Through extensive experiments and ablation studies on the OPV2V and V2V4Real datasets, we demonstrate the effectiveness of our method in cooperative perception, tracking, and motion prediction. In particular, CMP reduces the average prediction error by 12.3% compared with the strongest baseline. Our work marks a significant step forward in the cooperative capabilities of CAVs, showcasing enhanced performance in complex scenarios. More details can be found on the project website: https://cmp-cooperative-prediction.github.io.

Paper Structure (18 sections, 6 equations, 5 figures, 3 tables)


Figures (5)

  • Figure 1: A comparison between the traditional pipeline and the proposed multi-vehicle cooperative prediction pipeline. (a) The traditional pipeline conducts perception and prediction based on a single AV's raw sensor data. (b) The proposed pipeline involves multiple cooperative CAVs, which share information to enhance both perception and prediction.
  • Figure 2: An overall diagram of the proposed cooperative motion prediction pipeline.
  • Figure 3: The visualizations of predicted trajectories under different model settings in two traffic scenarios. In (a) and (c), some surrounding vehicles are not detected and the predicted trajectories (colored waypoints) without cooperation deviate significantly from the ground truth (black lines). In contrast, in (b) and (d) where cooperative prediction is enabled, the predicted trajectories become closer to the ground truth due to additional useful information from others.
  • Figure 4: A comparison of motion prediction performance at 5s prediction horizon under different areas covered by CAVs in OPV2V. The area is calculated based on the smallest convex hull that covers all the CAVs. As the number of CAVs increases in different scenarios, more areas are likely covered, which boosts the performance gap between no cooperation and cooperative prediction.
  • Figure 5: Micro-benchmarking of the pipeline latency across different modules. Each bar represents the cumulative latency distribution for a module. Colored rectangles show the 25th to 75th percentile range, whiskers denote minimum and maximum values, and red dots indicate medians. The average latency for each module is displayed on the right.
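The caption of Figure 4 states that the covered area is computed from the smallest convex hull enclosing all CAVs. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes each CAV is reduced to a 2D position in metres on the ground plane; the function names `convex_hull` and `covered_area` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or collinear turn.
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def covered_area(cav_positions: List[Point]) -> float:
    """Area of the convex hull of the CAV positions (shoelace formula)."""
    hull = convex_hull(cav_positions)
    if len(hull) < 3:
        # Fewer than three non-collinear CAVs enclose no area.
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    # Four CAVs at the corners of a 10 m square plus one inside it.
    print(covered_area([(0, 0), (10, 0), (10, 10), (0, 10), (5, 5)]))
```

Under this definition, a scenario with only two CAVs, or with CAVs strung along a straight road, has zero covered area, so how such cases are binned in Figure 4 depends on details the caption does not give.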