Language-Driven Interactive Traffic Trajectory Generation

Junkai Xia, Chenxin Xu, Qingyao Xu, Chen Xie, Yanfeng Wang, Siheng Chen

TL;DR

InteractTraj introduces language-driven interactive traffic trajectory generation by encoding natural language into interaction-aware numerical codes through a language-to-code encoder and decoding them into trajectories with a two-step, interaction-aware aggregation. The approach leverages map, vehicle, and interaction codes to fuse environmental context and multi-vehicle dynamics, enabling controllable generation of realistic interactive scenes. Experiments on WOMD and nuPlan show state-of-the-art realism with significant ADE/FDE improvements and strong user preference for language-conditioned interactions. This work advances simulation for autonomous driving by enabling flexible, language-guided, interactive traffic scenario generation and suggests future expansion to more participants and richer maps.
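
For readers unfamiliar with the ADE/FDE metrics mentioned above, the following is a minimal sketch of their standard definitions: average displacement error is the mean Euclidean distance between predicted and ground-truth positions over all timesteps, and final displacement error is that distance at the last timestep. The function name `displacement_errors` and the toy trajectories are illustrative assumptions; the paper's exact evaluation protocol (for example, how errors are aggregated across vehicles or scenes) is not given in this summary.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: Sequence[Point], truth: Sequence[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory against its ground truth.

    Hypothetical helper using the standard single-trajectory definitions.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Per-timestep Euclidean distance between predicted and true positions.
    dists = [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean over all timesteps
    fde = dists[-1]                # distance at the final timestep only
    return ade, fde


# Toy data: the prediction drives straight while the true path drifts sideways.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

Lower values are better for both metrics; FDE is at least as sensitive as ADE to errors that accumulate over the prediction horizon.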

Abstract

Realistic trajectory generation with natural language control is pivotal for advancing autonomous vehicle technology. However, previous methods focus on individual traffic participant trajectory generation, thus failing to account for the complexity of interactive traffic dynamics. In this work, we propose InteractTraj, the first language-driven traffic trajectory generator that can generate interactive traffic trajectories. InteractTraj interprets abstract trajectory descriptions into concrete formatted interaction-aware numerical codes and learns a mapping between these formatted codes and the final interactive trajectories. To interpret language descriptions, we propose a language-to-code encoder with a novel interaction-aware encoding strategy. To produce interactive traffic trajectories, we propose a code-to-trajectory decoder with interaction-aware feature aggregation that synergizes vehicle interactions with the environmental map and the vehicle movements. Extensive experiments show that our method outperforms previous SoTA methods, offering more realistic generation of interactive traffic trajectories with high controllability via diverse natural language commands. Our code is available at https://github.com/X1a-jk/InteractTraj.git
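
The two-stage data flow described in the abstract (language to codes, then codes to trajectories) can be sketched as a pair of pluggable functions. Everything below is an assumption made for illustration: the `SceneCodes` container, its field layout, and the toy `toy_encoder` and `toy_decoder` stand-ins are hypothetical and do not reflect the actual code format, the LLM-based encoder, or the learned decoder in the InteractTraj repository.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Trajectory = List[Tuple[float, float]]


@dataclass
class SceneCodes:
    """Hypothetical container for the three code types named in the TL;DR."""
    map_code: List[float]                 # assumed: a few numbers describing the map
    vehicle_codes: List[List[float]]      # assumed: [x0, y0, speed] per vehicle
    interaction_codes: List[List[float]]  # assumed: one entry per vehicle pair


def generate(description: str,
             encoder: Callable[[str], SceneCodes],
             decoder: Callable[[SceneCodes], List[Trajectory]]) -> List[Trajectory]:
    """Run the two stages in order and check one trajectory per vehicle code."""
    codes = encoder(description)
    trajectories = decoder(codes)
    if len(trajectories) != len(codes.vehicle_codes):
        raise ValueError("decoder must return one trajectory per vehicle code")
    return trajectories


def toy_encoder(description: str) -> SceneCodes:
    # Placeholder for the language-to-code encoder: ignores the text entirely
    # and returns fixed codes for two vehicles.
    return SceneCodes(map_code=[2.0],
                      vehicle_codes=[[0.0, 0.0, 5.0], [-10.0, 3.5, 8.0]],
                      interaction_codes=[[1.0]])


def toy_decoder(codes: SceneCodes, steps: int = 3) -> List[Trajectory]:
    # Placeholder for the code-to-trajectory decoder: constant-speed motion
    # along x, with no interaction-aware aggregation at all.
    return [[(x0 + speed * t, y0) for t in range(steps)]
            for x0, y0, speed in codes.vehicle_codes]


trajs = generate("a car overtakes the ego vehicle from the left", toy_encoder, toy_decoder)
```

The point of the sketch is only the interface: the numerical codes form an explicit intermediate representation, so the encoder and decoder can be developed and swapped independently.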

Paper Structure

This paper contains 25 sections, 13 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Overview of InteractTraj. InteractTraj uses a series of semantic interaction-aware numerical codes to depict interactive trajectories. An LLM-based language-to-code encoder converts language descriptions into numerical codes, which are then transformed into interactive trajectories by a code-to-trajectory decoder.
  • Figure 2: Sketch of interaction-aware prompt and numerical codes.
  • Figure 3: The architecture of code-to-trajectory decoder. The decoder generates vehicle trajectories by fusing and decoding information between vehicles and interactions.
  • Figure 4: Comparison of model performances under different settings on WOMD. Lower is better. InteractTraj generates more realistic interactive trajectories for different types. ST: straight forward, LT: left turn, RT: right turn, LC: left lane change, RC: right lane change and AVG: average performance.
  • Figure 5: Comparison of model performances under different interaction types. InteractTraj generates trajectories that better align with language descriptions by performing the right vehicle interactions.
  • ...and 5 more figures