Learning Cooperative Trajectory Representations for Motion Forecasting

Hongzhi Ruan; Haibao Yu; Wenxian Yang; Siqi Fan; Zaiqing Nie

TL;DR

This paper presents V2X-Graph, a representative framework for interpretable, end-to-end trajectory feature fusion in cooperative motion forecasting, and constructs V2X-Traj, the first real-world V2X motion forecasting dataset, in which every scenario contains multiple autonomous vehicles and infrastructure.

Abstract

Motion forecasting is an essential task for autonomous driving, and utilizing information from infrastructure and other vehicles can enhance forecasting capabilities. Existing research mainly focuses on leveraging single-frame cooperative information to enhance the limited perception capability of the ego vehicle, while underutilizing the motion and interaction context of traffic participants observed from cooperative devices. In this paper, we propose a forecasting-oriented representation paradigm to utilize motion and interaction features from cooperative information. Specifically, we present V2X-Graph, a representative framework for interpretable, end-to-end trajectory feature fusion in cooperative motion forecasting. V2X-Graph is evaluated on V2X-Seq in vehicle-to-infrastructure (V2I) scenarios. To extend the evaluation to vehicle-to-everything (V2X) scenarios, we construct V2X-Traj, the first real-world V2X motion forecasting dataset, in which every scenario contains multiple autonomous vehicles and infrastructure. Experimental results on both V2X-Seq and V2X-Traj show the advantage of our method. We hope both V2X-Graph and V2X-Traj will benefit the further development of cooperative motion forecasting. The project is available at https://github.com/AIR-THU/V2X-Graph.

Paper Structure (28 sections, 11 equations, 10 figures, 12 tables, 1 algorithm)

Introduction
Related Work
Preliminary
Methodology
Scene Representation with Graph
Feature Fusion with Interpretable Graph
Multimodal Future Decoder
Training Losses
Experiment
Experimental Setup
Main Results
Ablation Study
Conclusion and Limitation
Details of Training Loss
Details of V2X-Traj Dataset
...and 13 more sections

Figures (10)

Figure 1: Scheme Comparison. (a) Existing methods utilize cooperative perception information at each frame individually and then perform forecasting. (b) Our V2X-Graph considers this information from a typical forecasting perspective and employs interpretable trajectory feature fusion in an end-to-end manner to enhance the historical representation of agents for cooperative motion forecasting.
Figure 2: V2X-Graph overview. Trajectories from the ego view and other views, along with vector map information, are encoded as nodes and edges to construct a graph that represents a cooperative scenario. The novel interpretable graph guides forecasting-oriented trajectory feature fusion, covering both motion and interaction features. In this figure, solid rectangles represent encodings of ego-view trajectories and hollow circles represent encodings of cooperative trajectories, distinguished by distinct colors. Within the same view, the use of the same color indicates interruptions caused by occlusion. Triangles represent encodings of lane segments. In trajectory feature fusion, a grey arrow indicates a missing frame in the motion case and a lane segment vector in the interaction case.
Figure 3: V2X-Traj dataset. (a) Statistics of the total number and average length for the 8 classes of agents. (b) Visualizations. Orange boxes represent autonomous vehicles, blue elements denote other traffic participants, and the green box denotes the target agent to be predicted.
Figure 4: Effectiveness of pseudo label supervision.
Figure 5: Qualitative results on V2X-Traj. There are several interesting cooperative scenarios at the challenging intersection, including speed-up, lane changing and turning. We visualize only the forecasting results of the target agent in each scenario for clarity. The ground-truth trajectories are shown in red, and the multimodal predicted trajectories are shown in green.
...and 5 more figures
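
The graph construction described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Trajectory` and `LaneSegment` types, the `build_graph` function, and both distance thresholds are hypothetical names and values chosen for illustration. A simple distance rule stands in for the association that V2X-Graph obtains end to end. The sketch only shows the data structure: trajectories from each view and lane segments become nodes, "motion" edges link trajectories that plausibly belong to the same agent seen from different views, and "interaction" edges link an agent to nearby agents and lanes.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Trajectory:
    view: str                 # e.g. "ego" or a cooperative view name (hypothetical labels)
    track_id: str
    points: Dict[int, Point]  # frame index -> (x, y); missing frames are simply absent


@dataclass
class LaneSegment:
    lane_id: str
    start: Point
    end: Point


def node_key(t: Trajectory) -> tuple:
    return ("traj", t.view, t.track_id)


def mean_gap(a: Trajectory, b: Trajectory) -> Optional[float]:
    """Mean distance over frames observed in both trajectories, or None if none overlap."""
    shared = a.points.keys() & b.points.keys()
    if not shared:
        return None
    total = sum(
        hypot(a.points[f][0] - b.points[f][0], a.points[f][1] - b.points[f][1])
        for f in shared
    )
    return total / len(shared)


def build_graph(
    trajs: List[Trajectory],
    lanes: List[LaneSegment],
    assoc_thresh: float = 1.0,      # assumed threshold for cross-view association
    interact_radius: float = 10.0,  # assumed radius for interaction edges
):
    nodes = [node_key(t) for t in trajs] + [("lane", l.lane_id) for l in lanes]
    edges = []
    for i, a in enumerate(trajs):
        for b in trajs[i + 1:]:
            gap = mean_gap(a, b)
            if gap is None:
                continue
            if a.view != b.view and gap <= assoc_thresh:
                # Likely the same agent seen from two views: fuse motion features.
                edges.append(("motion", node_key(a), node_key(b)))
            elif gap <= interact_radius:
                # Different agents close to each other: fuse interaction features.
                edges.append(("interaction", node_key(a), node_key(b)))
        # Connect the agent to lane segments near its last observed position.
        last = a.points[max(a.points)]
        for lane in lanes:
            mid = ((lane.start[0] + lane.end[0]) / 2, (lane.start[1] + lane.end[1]) / 2)
            if hypot(last[0] - mid[0], last[1] - mid[1]) <= interact_radius:
                edges.append(("interaction", node_key(a), ("lane", lane.lane_id)))
    return nodes, edges
```

Because each trajectory is kept whole rather than merged frame by frame, a view that misses some frames of an agent (for example under occlusion) still contributes the frames it does observe.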
