Vision-based Multi-future Trajectory Prediction: A Survey

Renhao Huang; Hao Xue; Maurice Pagnucco; Flora Salim; Yang Song

TL;DR

This survey addresses multi-future trajectory prediction (MTP) in vision-based settings, framing the problem as predicting $K$ plausible future paths for each agent given its past motion and scene context. It introduces a taxonomy of MTP frameworks (noise-based, anchor-conditioned, and recurrent-based) and surveys datasets, evaluation metrics, and experimental results, including distribution-aware analyses on the ForkingPath dataset. The paper compares leading MTP models on standard benchmarks, highlights the strengths and limitations of current evaluation approaches (e.g., MoN, MR, KDE-NLL, AMD/AMV), and discusses practical directions for metrics, data, efficiency, and explainability. Finally, it outlines future avenues such as integrating MTP with robust motion planning, language-guided explainable MTP, and diverse learning tasks in domains beyond pedestrian and vehicle trajectory prediction, with the aim of advancing safe and scalable autonomous systems. Predicting $K$ futures, covering the distribution of plausible outcomes, and respecting social-acceptance constraints are central to these developments, with implications for real-time planning and risk assessment in complex scenes.
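To make two of the metrics named above concrete, the following is a minimal sketch of minimum-of-N (MoN) displacement errors and miss rate (MR) as they are commonly defined for $K$-candidate predictions. The function names, the pure-Python trajectory representation (lists of `(x, y)` points), and the 2.0 threshold are assumptions made for illustration; they are not taken from the survey.

```python
import math

def min_of_n(preds, gt):
    """Return (minADE, minFDE) over K candidate futures.

    preds: list of K trajectories, each a list of (x, y) points.
    gt:    ground-truth trajectory with the same number of points.
    The two minima may come from different candidates.
    """
    ades, fdes = [], []
    for traj in preds:
        dists = [math.dist(p, g) for p, g in zip(traj, gt)]
        ades.append(sum(dists) / len(dists))  # average displacement error
        fdes.append(dists[-1])                # final displacement error
    return min(ades), min(fdes)

def miss_rate(all_preds, all_gt, threshold=2.0):
    """Fraction of agents whose best final-point error exceeds `threshold`.

    The default threshold is an illustrative assumption.
    """
    misses = [min_of_n(p, g)[1] > threshold for p, g in zip(all_preds, all_gt)]
    return sum(misses) / len(misses)

# Toy example: one ground truth, K = 2 candidates offset by 1 and 3 units.
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
near = [(x, y + 1.0) for x, y in gt]
far = [(x, y + 3.0) for x, y in gt]
min_ade, min_fde = min_of_n([near, far], gt)

# Second agent: both candidates are far from the ground truth, so it is a miss.
very_far = [[(x, y + 6.0) for x, y in gt], [(x, y + 8.0) for x, y in gt]]
mr = miss_rate([[near, far], very_far], [gt, gt])
```

Because MoN scores only the best of the $K$ candidates, it says nothing about how plausible the remaining candidates are, which is one reason distribution-aware metrics are discussed alongside it.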

Abstract

Vision-based trajectory prediction is an important task that supports safe and intelligent behaviours in autonomous systems. Many advanced approaches have been proposed over the years with improved spatial and temporal feature extraction. However, human behaviour is naturally diverse and uncertain. Given the past trajectory and surrounding environment information, an agent can have multiple plausible trajectories in the future. To tackle this problem, an essential task named multi-future trajectory prediction (MTP) has recently been studied. This task aims to generate a diverse, acceptable and explainable distribution of future predictions for each agent. In this paper, we present the first survey for MTP with our unique taxonomies and a comprehensive analysis of frameworks, datasets and evaluation metrics. We also compare models on existing MTP datasets and conduct experiments on the ForkingPath dataset. Finally, we discuss multiple future directions that can help researchers develop novel multi-future trajectory prediction systems and other diverse learning tasks similar to MTP.

Paper Structure (30 sections, 15 equations, 6 figures, 7 tables)

Introduction
Uncertainty of Human's Future Behaviour
Multi-future Trajectory Prediction
A Survey for MTP
Background: General Framework of Deep Learning-based Trajectory Prediction
Input Data
Overall Problem Definition
Feature Extraction
Decoding
Relationship between STP and MTP
Frameworks for MTP
Noise-based MTP Framework
Anchor Conditioned MTP Framework
Recurrent-based MTP Framework
Other Techniques for Improved MTP
...and 15 more sections

Figures (6)

Figure 1: An example of Multi-future Trajectory Prediction. The blue, green and red lines are the observed, ground truth and other possible paths respectively. The agent can have multiple plausible trajectories including the ground truth given an observed path.
Figure 2: Background: pipeline of deep learning-based trajectory prediction and the role of MTP. This survey mainly focuses on the decoders in MTP frameworks and MTP Evaluation.
Figure 3: An overview of the taxonomy of MTP frameworks.
Figure 4: General Pipelines of MTP frameworks. The block with multiple layers is executed repeatedly with $K$ different inputs. Dashed blue lines and blocks are executed for training only.
Figure 5: Examples of MTP satisfying accuracy (Acc), diversity (Div) and social-acceptance (Soc), where the shaded part denotes the predicted distributions. Case studies: (a) Predictions do not cover the ground truth. (b) Predictions hit the ground truth but only cover a single mode. (c) Predictions cover the ground truth and are diverse, but some of them enter the non-walkable zone. (d) The expected predictions.
...and 1 more figure
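The three properties illustrated in Figure 5 (accuracy, diversity, and social-acceptance) can be approximated with simple checks on a set of predicted trajectories. The sketch below is a toy under stated assumptions: the helper names, the endpoint radius, and the `is_walkable` callback are hypothetical, and these proxies are not the metrics evaluated in the survey.

```python
import math
from itertools import combinations

def covers_ground_truth(preds, gt, radius=1.0):
    """Accuracy proxy: does any predicted endpoint land within `radius` of the true one?"""
    return any(math.dist(traj[-1], gt[-1]) <= radius for traj in preds)

def endpoint_spread(preds):
    """Diversity proxy: mean pairwise distance between predicted endpoints."""
    pairs = list(combinations([traj[-1] for traj in preds], 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def walkable_fraction(preds, is_walkable):
    """Social-acceptance proxy: share of predicted points inside the walkable area."""
    points = [p for traj in preds for p in traj]
    return sum(1 for p in points if is_walkable(p)) / len(points)

# Toy scene: everything with y >= 0 is walkable (a hypothetical rule).
def is_walkable(point):
    return point[1] >= 0.0

gt = [(1.0, 0.0), (2.0, 0.0)]
preds = [
    [(1.0, 0.0), (2.0, 0.0)],   # goes straight, matches the ground truth
    [(1.0, 0.0), (1.0, 1.0)],   # turns left, stays walkable
    [(1.0, 0.0), (1.0, -1.0)],  # turns right into the non-walkable zone
]

accurate = covers_ground_truth(preds, gt)
spread = endpoint_spread(preds)
walkable = walkable_fraction(preds, is_walkable)
```

In this toy scene the predictions cover the ground truth and are spread out, but one of the six predicted points leaves the walkable area, which corresponds to case (c) in the caption.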
