Generalizable Trajectory Prediction via Inverse Reinforcement Learning with Mamba-Graph Architecture

Wenyun Li, Wenjie Huang, Zejian Deng, Chen Sun

TL;DR

The paper tackles trajectory prediction under domain shift in urban driving. It introduces an environment-aware Mamba predictor that combines long-range sequence modeling with a Graph Attention Network (GAT) to capture inter-vehicle interactions, augmented by Maximum Entropy (MaxEnt) IRL to learn diverse reward functions from human demonstrations. A policy extension for Out-of-Distribution (OOD) scenarios and TD3-based off-policy learning enable cross-domain adaptation without ground-truth data from the target domain, while ablations quantify the contributions of the IRL and GAT components. Empirical results across urban intersections, roundabouts, and highway scenarios show state-of-the-art accuracy and substantially improved cross-domain generalization, with performance competitive with fine-tuning in unseen environments. The work advances practical autonomous driving by enabling robust, transferable trajectory predictions across varied traffic scenarios.

Abstract

Accurate driving behavior modeling is fundamental to safe and efficient trajectory prediction, yet remains challenging in complex traffic scenarios. This paper presents a novel Inverse Reinforcement Learning (IRL) framework that captures human-like decision-making by inferring diverse reward functions, enabling robust cross-scenario adaptability. The learned reward function is used to maximize the likelihood of the predicted output in a model that integrates Mamba blocks, for efficient long-sequence dependency modeling, with graph attention networks that encode spatial interactions among traffic agents. Comprehensive evaluations on urban intersections and roundabouts demonstrate that the proposed method not only outperforms various popular approaches in prediction accuracy but also achieves 2.3 times higher generalization performance on unseen scenarios than other baselines, with adaptability in Out-of-Distribution settings that is competitive with fine-tuning.
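The reward-inference idea in the abstract can be illustrated with a minimal sketch of textbook Maximum Entropy IRL over a small, fixed set of candidate trajectories. This is not the paper's implementation: the linear reward, the three feature names, the candidate set, and the learning rate are all assumptions made for illustration. Under MaxEnt IRL, a trajectory's probability is proportional to the exponential of its reward, and the log-likelihood gradient of a demonstration is its feature vector minus the expected feature vector under the current distribution.

```python
import math

def maxent_trajectory_probs(features, weights):
    """P(tau_i) proportional to exp(w . f(tau_i)) over a finite candidate set."""
    rewards = [sum(w * f for w, f in zip(weights, feat)) for feat in features]
    shift = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp(r - shift) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]

def maxent_gradient(features, weights, demo_index):
    """Log-likelihood gradient of the demonstration: f_demo - E_P[f]."""
    probs = maxent_trajectory_probs(features, weights)
    dim = len(weights)
    expected = [sum(p * feat[k] for p, feat in zip(probs, features))
                for k in range(dim)]
    return [features[demo_index][k] - expected[k] for k in range(dim)]

# Hypothetical per-trajectory features: [progress, -jerk, -collision_risk].
candidates = [
    [1.0, -0.2, -0.1],
    [0.8, -0.1, -0.0],  # assumed to be the human demonstration
    [1.2, -0.9, -0.6],
]
demo_index = 1

# Plain gradient ascent on the demonstration's log-likelihood.
weights = [0.0, 0.0, 0.0]
for _ in range(200):
    grad = maxent_gradient(candidates, weights, demo_index)
    weights = [w + 0.1 * g for w, g in zip(weights, grad)]

probs = maxent_trajectory_probs(candidates, weights)
print("learned weights:", weights)
print("trajectory probabilities:", probs)
```

After training, the demonstrated trajectory receives the highest probability among the candidates. A full pipeline would replace the fixed candidate set with trajectories sampled or generated by the predictor, and the linear reward with a learned network.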

Paper Structure

This paper contains 11 sections, 12 equations, 7 figures, 5 tables, 2 algorithms.

Figures (7)

  • Figure 1: Illustration of driving behavior modeling based on Maximum Entropy Inverse Reinforcement Learning for Cross-Scenario Adaptability.
  • Figure 2: The overall structure of the proposed trajectory prediction framework.
  • Figure 3: Environment and trajectory notation of the urban intersection scenario at Heckstrasse [Bock2019TheID].
  • Figure 4: Cross-Scenario Adaptability (CSA) scores by metric visualized in radar chart.
  • Figure 5: Validation-set metrics obtained from fine-tuning on 100 random instances of the highD dataset, showing convergence within 6 epochs. Based on the experimental results, the input projection, time parameter projection, and output projection modules in the Mamba block are selected as the optimal targets for LoRA fine-tuning, with a parameter compression rate of 3.91%.
  • ...and 2 more figures