Training Trajectory Predictors Without Ground-Truth Data

Mikolaj Kliniewski, Jesse Morris, Ian R. Manchester, Viorela Ila

TL;DR

This work tackles the problem of cross-environment robustness in trajectory prediction by removing the reliance on ground-truth (GT) data during training. It uses DynoSAM, a dynamic SLAM-based estimation pipeline, to extract accurate motion from raw sensor data and feed Trajectron++ for multi-agent forecasting, with training performed on estimation-derived inputs. A new Absolute Consistency Error (ACE) metric assesses temporal stability, and results show that models trained on DynoSAM data can outperform GT-trained ones, especially in limited-data regimes, highlighting the approach's practical value for real-world autonomous navigation. The framework couples motion estimation and prediction end to end, supporting safer, real-time decision-making across diverse environments.

Abstract

This paper presents a framework capable of accurately and smoothly estimating position, heading, and velocity. Using this high-quality input, we propose a system based on Trajectron++, able to consistently generate precise trajectory predictions. Unlike conventional models that require ground-truth data for training, our approach eliminates this dependency. Our analysis demonstrates that poor-quality input leads to noisy and unreliable predictions, which can be detrimental to navigation modules. We evaluate both input data quality and model output to illustrate the impact of input noise. Furthermore, we show that our estimation system enables effective training of trajectory prediction models even with limited data, producing robust predictions across different environments. Accurate estimates are crucial for deploying trajectory prediction models in real-world scenarios, and our system ensures meaningful and reliable results across various application contexts.

Paper Structure

This paper contains 17 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: System diagram. Our pipeline processes RGB-D sensor data through the DynoSAM estimation module, which computes object estimates. The estimated position, heading, and velocity are fed into the Trajectron++ model that generates future 2D positions of the object.
  • Figure 2: Euclidean distance between consecutive object positions across the trajectory for three data sources. Noisier curves indicate less physically feasible estimations.
  • Figure 5: $\mathbf{ACE}$ of object $2$ in KITTI $00$ for three models.
  • Figure 6: Object's heading values across the trajectory for three data sources. Noisier curves indicate less physically feasible estimations.
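
The quantities plotted in Figures 2 and 6 can be illustrated with a minimal sketch. It computes the Euclidean distance between consecutive object positions and the frame-to-frame heading change for a single trajectory. The function names, the array layout (one row per frame), the radian convention for headings, and the `roughness` indicator are all assumptions made for illustration; they are not taken from the paper or from DynoSAM, and this is not the ACE metric, which is not defined in this summary.

```python
import numpy as np


def step_distances(positions):
    """Euclidean distance between consecutive positions.

    positions: array-like of shape (T, D), one row per frame (assumed layout).
    Returns an array of length T - 1.
    """
    p = np.asarray(positions, dtype=float)
    return np.linalg.norm(np.diff(p, axis=0), axis=1)


def heading_changes(headings):
    """Frame-to-frame heading change, wrapped to [-pi, pi).

    headings: array-like of length T, in radians (assumed convention).
    """
    h = np.asarray(headings, dtype=float)
    d = np.diff(h)
    return (d + np.pi) % (2.0 * np.pi) - np.pi


def roughness(signal):
    """Hypothetical noise indicator: standard deviation of the first
    difference of a 1-D signal. Larger values mean a noisier curve."""
    return float(np.std(np.diff(np.asarray(signal, dtype=float))))


if __name__ == "__main__":
    # Constant-velocity track versus one with jitter along x.
    smooth = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
    noisy = [[0.0, 0.0], [1.5, 0.0], [2.0, 0.0], [3.5, 0.0]]

    print(step_distances(smooth), roughness(step_distances(smooth)))
    print(step_distances(noisy), roughness(step_distances(noisy)))
```

For an object moving at roughly constant speed, the step distances should form a nearly flat curve, so a large `roughness` value on either signal suggests the kind of physically implausible jitter the captions describe. Wrapping the heading difference avoids spurious jumps of about 2π when the heading crosses ±π.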