AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction

Ray Coden Mercurius, Ehsan Ahmadi, Soheil Mohamad Alizadeh Shabestary, Amir Rasouli

TL;DR

AMEND addresses long-tailed pedestrian trajectory prediction by partitioning data into subdomains via latent-space clustering and training specialized experts for each cluster. A lightweight router is learned to assign each input to the most competent expert, enabling a winner-takes-all inference with no additional computational cost. The approach is model-agnostic and validated on ETH-UCY, where it outperforms baselines on challenging tail scenarios; ablations confirm that clustering, expert specialization, and routing contribute to the improvements. This framework offers a scalable, modular solution for robust tail-aware trajectory forecasting in autonomous driving contexts.
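
As an illustration of the partitioning step, the following is a minimal sketch of clustering latent vectors into subdomains. The summary does not name the clustering algorithm, so plain k-means is used here as an assumption, and the 2-D `latents` values are made-up stand-ins for encoder outputs.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centroids

# Hypothetical 2-D latent vectors: a larger "head" group near the origin
# and a smaller "tail" group near (5, 5).
latents = [(0.0, 0.1), (5.0, 5.1), (0.1, 0.0), (0.2, 0.1), (5.1, 4.9), (0.0, 0.2)]
labels, centroids = kmeans(latents, k=2)
```

Each resulting label identifies the subdomain a sample belongs to, which is the information an expert would need in order to specialize on one cluster.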

Abstract

Accurate prediction of pedestrians' future motions is critical for intelligent driving systems. Developing models for this task requires rich datasets containing diverse sets of samples. However, existing naturalistic trajectory prediction datasets are generally imbalanced in favor of simpler samples and lack challenging scenarios. This long-tail effect causes prediction models to underperform on the tail portion of the data distribution, which contains safety-critical scenarios. Previous work tackles the long-tail problem with techniques such as contrastive learning and class-conditioned hypernetworks. These approaches, however, are not modular and cannot be applied to many machine learning architectures. In this work, we propose a modular, model-agnostic framework for trajectory prediction that leverages a specialized mixture of experts. In our approach, each expert is trained with a specialized skill with respect to a particular part of the data. To produce predictions, we use a router network that selects the best expert by generating relative confidence scores. We conduct experiments on common pedestrian trajectory prediction datasets and show that our method improves performance on long-tail scenarios. We further conduct ablation studies to highlight the contribution of each proposed component.
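
The routing idea can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: `toy_router`, `constant_velocity`, and `stand_still` are hypothetical stand-ins for the learned router and the trained experts. The point is only that the router scores all experts, but just the top-scoring expert is run.

```python
import math

def softmax(scores):
    """Turn raw router scores into relative confidences that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_and_predict(history, router, experts):
    """Score every expert with the router, then run only the best one."""
    confidences = softmax(router(history))
    best = max(range(len(experts)), key=lambda i: confidences[i])
    return best, experts[best](history)

# Hypothetical expert 0: extrapolate the last observed velocity.
def constant_velocity(history, steps=3):
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * t, y1 + vy * t) for t in range(1, steps + 1)]

# Hypothetical expert 1: predict that the pedestrian stays in place.
def stand_still(history, steps=3):
    return [history[-1]] * steps

# Hypothetical router: a hand-written rule standing in for a learned network.
def toy_router(history):
    (x0, y0), (x1, y1) = history[-2], history[-1]
    speed = math.hypot(x1 - x0, y1 - y0)
    return [speed, 0.5]  # scores for [constant_velocity, stand_still]

experts = [constant_velocity, stand_still]
moving_choice, moving_pred = route_and_predict([(0.0, 0.0), (1.0, 0.0)], toy_router, experts)
idle_choice, idle_pred = route_and_predict([(0.0, 0.0), (0.1, 0.0)], toy_router, experts)
```

In this sketch the walking pedestrian is routed to `constant_velocity` and the nearly stationary one to `stand_still`.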

Paper Structure (16 sections, 4 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Overview of the proposed approach. We cluster the data samples based on the latent vector of an encoder network. During training of the experts, the loss function is adjusted so that each expert focuses on a particular sample cluster. Next, we calculate the relative performance rankings of the experts, which are used to generate targets to train the router network. At inference, we use the router network to select the best expert to generate the predictions.
  • Figure 2: Radar plot of the FDE of each expert on the different cluster splits of the ETH-UCY Hotel test dataset, showing significant variations in the performance of the experts. The best-performing expert for each cluster tends to be the one that was assigned to it during training.
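
The two training-side steps described in the Figure 1 caption can be sketched as below. The captions do not give the exact loss adjustment or the exact form of the router targets, so both are assumptions here: the expert loss up-weights samples from the expert's own cluster by a hypothetical `focus` factor, and the router target is a one-hot vector at the top-ranked expert. The error values are invented for illustration.

```python
def cluster_weighted_loss(errors, clusters, expert_cluster, focus=4.0):
    """Weighted mean error; samples in the expert's own cluster count `focus` times.

    Assumed weighting scheme, chosen only to illustrate a cluster-focused loss.
    """
    weights = [focus if c == expert_cluster else 1.0 for c in clusters]
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)

def rank_experts(per_expert_error):
    """Return each expert's rank on one sample (0 = lowest error).

    Ties are broken by expert index because sorted() is stable.
    """
    order = sorted(range(len(per_expert_error)), key=lambda i: per_expert_error[i])
    ranks = [0] * len(order)
    for rank, expert in enumerate(order):
        ranks[expert] = rank
    return ranks

def router_target(per_expert_error):
    """One-hot router target at the best-ranked expert (assumed target form)."""
    return [1.0 if r == 0 else 0.0 for r in rank_experts(per_expert_error)]

# Hypothetical per-expert errors (e.g. FDE) on a single training sample.
sample_errors = [0.42, 0.31, 0.77]
ranks = rank_experts(sample_errors)
target = router_target(sample_errors)

# Hypothetical two-sample batch for the expert assigned to cluster 1.
loss = cluster_weighted_loss([1.0, 2.0], clusters=[0, 1], expert_cluster=1, focus=3.0)
```

Here the second expert has the lowest error, so it receives rank 0 and the router target points at it.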