Toward Unified Practices in Trajectory Prediction Research on Bird's-Eye-View Datasets

Theodor Westny; Björn Olofsson; Erik Frisk

TL;DR

This work argues that standardized preprocessing and evaluation are essential for fair comparisons in bird's-eye-view (BEV) trajectory prediction and introduces dronalize, an open-source PyTorch toolbox that unifies data handling across multiple BEV datasets. It prescribes a consistent pipeline for dataset splits, coordinate systems, downsampling with anti-aliasing, and agent and map features, together with a graph-based data structure that accommodates variable scene sizes. The toolbox implements common metrics for single- and multimodal prediction, including final displacement error (FDE) and average negative log-likelihood (ANLL), and supports both single-agent and multi-agent tasks with interaction-aware setups. By aligning preprocessing and evaluation practices across diverse datasets (e.g., highD, rounD, inD, exiD, uniD, SIND, INTERACTION), the approach aims to reduce reproducibility gaps and accelerate research in trajectory forecasting for autonomous driving. The work also outlines future extensions to broaden dataset coverage and benchmarking capabilities.
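As a rough illustration of the FDE metric mentioned above, the following minimal sketch computes the final displacement error for one predicted trajectory and its minimum over several predicted modes. The function names and the toy trajectories are hypothetical and are not taken from the dronalize toolbox; trajectories are assumed to be lists of (x, y) positions in meters.

```python
import math


def final_displacement_error(pred, target):
    """Euclidean distance between the last predicted and last true position."""
    (px, py), (tx, ty) = pred[-1], target[-1]
    return math.hypot(px - tx, py - ty)


def min_final_displacement_error(modes, target):
    """Smallest FDE over a set of predicted modes (multimodal case)."""
    return min(final_displacement_error(mode, target) for mode in modes)


# Toy data (hypothetical): a straight ground-truth path and two predicted modes.
target = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
modes = [
    [(1.0, 0.0), (2.0, 0.5), (3.0, 4.0)],  # final point is 4 m from the truth
    [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)],  # final point is 1 m from the truth
]

fde_first = final_displacement_error(modes[0], target)
min_fde = min_final_displacement_error(modes, target)
```

Only the final time step enters the metric, so two predictions with very different intermediate paths can share the same FDE.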

Abstract

The availability of high-quality datasets is crucial for the development of behavior prediction algorithms in autonomous vehicles. This paper highlights the need to standardize the use of certain datasets for motion forecasting research to simplify comparative analysis and proposes a set of tools and practices to achieve this. Drawing on extensive experience and a comprehensive review of current literature, we summarize our proposals for preprocessing, visualization, and evaluation in the form of an open-sourced toolbox designed for researchers working on trajectory prediction problems. The clear specification of necessary preprocessing steps and evaluation metrics is intended to alleviate development efforts and facilitate the comparison of results across different studies. The toolbox is available at: https://github.com/westny/dronalize.

Paper Structure (17 sections, 3 equations, 5 figures, 5 tables)


Introduction
Contributions
Background
Related Work
Preprocessing
Prediction Objective
Dataset Splits
Coordinate System
Downsampling
Agent Features
Tracking Features
Agent Classes
High-Definition Maps
Data Structure
Evaluation Metrics
...and 2 more sections

Figures (5)

Figure 1: Example scenario from the rounD dataset. The color of the agents represents how they are scored in the prediction task. The blue vehicle is the single-agent target, the green vehicles are the multi-agent targets, and the red vehicles are the non-scored surrounding agents. The same-colored lines following each agent represent the observed past trajectory, while the same-colored dotted lines preceding them indicate their future trajectory.
Figure 2: Example of three different data splits, denoted by (a), (b), and (c), using the proposed partitioning method. The figure illustrates how the bins are divided into training, validation, and test sets based on the number of frames in each recording. The different colors are used to indicate which set each bin belongs to, where the dashed lines represent the boundaries between the bins.
Figure 3: Maneuver class hierarchy and corresponding values.
Figure 4: Example simulation of a vehicle moving along a circular path with a radius of $16$ m at a speed of $50$ km/h (the speed limit) for a duration of $5$ s, using different sampling rates in a forward Euler integration scheme. The values represent nominal conditions for the rounD dataset. The figure illustrates how the sampling rate can affect motion prediction quality, which is relevant when assessing the physical feasibility of predicted trajectories.
Figure 5: Example lane graphs for the rounD, inD, SIND, and INTERACTION datasets. The lane graphs are constructed from the available Lanelet files, which contain semantic information about the scene, by extracting lane boundaries and markings, coloring them according to their type, and connecting them with edges. Although these examples are simple, combined with the available traffic data, they could be sufficient for many trajectory prediction tasks. Using the provided functionality, users can develop more complex lane graphs based on the desired level of detail.
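The experiment described in the Figure 4 caption can be approximated with a minimal sketch. It assumes a constant-speed, constant-yaw-rate kinematic model integrated with forward Euler; the caption does not state which motion model the authors used, so this model, the function names, and the sampling rates of 2, 5, and 25 Hz are illustrative assumptions. The radius, speed, and duration are the values given in the caption.

```python
import math


def simulate_circle_euler(radius_m, speed_mps, duration_s, rate_hz):
    """Forward Euler rollout of a constant-speed, constant-yaw-rate model.

    Starts at the origin heading along +x; the ideal path is a circle
    centered at (0, radius_m). Returns the final (x, y) position.
    """
    dt = 1.0 / rate_hz
    steps = round(duration_s * rate_hz)
    yaw_rate = speed_mps / radius_m  # rad/s required to stay on the circle
    x, y, heading = 0.0, 0.0, 0.0
    for _ in range(steps):
        # Forward Euler: advance the state using derivatives at the current step.
        x += dt * speed_mps * math.cos(heading)
        y += dt * speed_mps * math.sin(heading)
        heading += dt * yaw_rate
    return x, y


def exact_circle_position(radius_m, speed_mps, duration_s):
    """Closed-form position on the ideal circle after duration_s seconds."""
    angle = speed_mps / radius_m * duration_s
    return radius_m * math.sin(angle), radius_m * (1.0 - math.cos(angle))


RADIUS_M = 16.0          # from the Figure 4 caption
SPEED_MPS = 50.0 / 3.6   # 50 km/h converted to m/s
DURATION_S = 5.0         # from the Figure 4 caption

x_ref, y_ref = exact_circle_position(RADIUS_M, SPEED_MPS, DURATION_S)
errors = {}
for rate_hz in (2, 5, 25):  # illustrative sampling rates (assumption)
    x, y = simulate_circle_euler(RADIUS_M, SPEED_MPS, DURATION_S, rate_hz)
    errors[rate_hz] = math.hypot(x - x_ref, y - y_ref)
    print(f"{rate_hz:>2} Hz: final position error = {errors[rate_hz]:.2f} m")
```

Under these assumptions the final position error shrinks as the sampling rate increases, which is the effect the caption points to: a coarse time step makes a physically feasible turn appear to drift off the circle.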
