TrajGPT-R: Generating Urban Mobility Trajectory with Reinforcement Learning-Enhanced Generative Pre-trained Transformer

Jiawei Wang; Chuang Yang; Jiawei Yong; Xiaohang Xu; Hongjun Wang; Noboru Koshizuka; Shintaro Fukushima; Ryosuke Shibasaki; Renhe Jiang

TL;DR

This research introduces a framework for generating large-scale urban mobility trajectories. A transformer-based model is pre-trained and then fine-tuned in a two-phase process, and it outperforms existing models in reliability and diversity.

Abstract

Mobility trajectories are essential for understanding urban dynamics and enhancing urban planning, yet access to such data is frequently hindered by privacy concerns. This research introduces a transformative framework for generating large-scale urban mobility trajectories, employing a novel application of a transformer-based model pre-trained and fine-tuned through a two-phase process. Initially, trajectory generation is conceptualized as an offline reinforcement learning (RL) problem, with a significant reduction in vocabulary space achieved during tokenization. The integration of Inverse Reinforcement Learning (IRL) allows for the capture of trajectory-wise reward signals, leveraging historical data to infer individual mobility preferences. Subsequently, the pre-trained model is fine-tuned using the constructed reward model, effectively addressing the challenges inherent in traditional RL-based autoregressive methods, such as long-term credit assignment and handling of sparse reward environments. Comprehensive evaluations on multiple datasets illustrate that our framework markedly surpasses existing models in terms of reliability and diversity. Our findings not only advance the field of urban mobility modeling but also provide a robust methodology for simulating urban data, with significant implications for traffic management and urban development planning. The implementation is publicly available at https://github.com/Wangjw6/TrajGPT_R.
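
The abstract mentions that tokenization shrinks the vocabulary space but does not say how. As a minimal sketch of one way this could work, the example below encodes each step of a trajectory as the index of the chosen downstream link rather than as a global link ID, so the action vocabulary is bounded by the largest out-degree instead of the number of links. The toy network `SUCCESSORS` and the helpers `encode_actions`, `decode_actions`, and `action_vocab_size` are hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List

# Hypothetical toy road network (an assumption for illustration):
# each link ID maps to the ordered list of its downstream links.
SUCCESSORS: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["e"],
    "e": [],
}


def encode_actions(trajectory: List[str], successors: Dict[str, List[str]]) -> List[int]:
    """Turn a sequence of link IDs into local action indices.

    Each action is the position of the next link within the current
    link's successor list, not a global link ID.
    """
    actions = []
    for current, nxt in zip(trajectory, trajectory[1:]):
        options = successors[current]
        if nxt not in options:
            raise ValueError(f"{nxt!r} is not downstream of {current!r}")
        actions.append(options.index(nxt))
    return actions


def decode_actions(start: str, actions: List[int], successors: Dict[str, List[str]]) -> List[str]:
    """Replay local action indices from a start link to recover link IDs."""
    trajectory = [start]
    for action in actions:
        trajectory.append(successors[trajectory[-1]][action])
    return trajectory


def action_vocab_size(successors: Dict[str, List[str]]) -> int:
    """Number of distinct action tokens needed: the maximum out-degree."""
    return max(len(options) for options in successors.values())


trajectory = ["a", "c", "d", "e"]
actions = encode_actions(trajectory, SUCCESSORS)
print(actions)                                    # local action indices
print(decode_actions("a", actions, SUCCESSORS))   # recovered link IDs
print(len(SUCCESSORS), action_vocab_size(SUCCESSORS))
```

In this toy network the action vocabulary has 2 entries while there are 5 links. On a real road network the gap would be far larger, since out-degree stays small as the number of links grows. Whether TrajGPT-R uses this exact scheme is not stated in the text above.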

Paper Structure (20 sections, 28 equations, 9 figures, 5 tables)

Introduction
Problem
Methodology
Offline-RL based Trajectory Generation Pretraining
Inverse RL based Reward Model Construction
Reward Model-based Fine-tuning
Results
Tasks and Datasets
Evaluation Metrics
Model Configurations
Baselines
Performance Evaluation
Interpretability Analysis
Discussion
Appendix
...and 5 more sections

Figures (9)

Figure 1: Trajectory generation as a sequential decision-making problem. The vehicle navigates the city by deciding, at each link, which downstream link to take next.
Figure 2: Our proposed two-phase framework (TrajGPT-R), which uses reinforcement learning to enhance a pre-trained generative model for urban mobility trajectory generation. Phase 1: A Generative Pre-trained Transformer (GPT) is developed to acquire general knowledge for generating urban mobility trajectory data, while a reward model is constructed using inverse reinforcement learning to capture trajectory-wise preferences. Phase 2: A reward model-based fine-tuning (RMFT) scheme is introduced to enhance the pre-trained model for better generation reliability and diversity.
Figure 3: Autoregressive decision-making process in Transformer-based mobility trajectory generation. At each generation step $t+1$, all preceding tokens contribute to predicting the action token at $t+1$. The significance of each token's contribution is determined by its attention score.
Figure 4: Visualization of generated and ground-truth trajectories on the Toyota dataset within the core area of Tokyo, Japan. The trajectories are drawn in red, and different training phases are highlighted. TrajGPT reproduces the ground truth well. Moreover, the fine-tuning phase (e.g., TrajGPT-DPO and TrajGPT-R) can improve the model's generalization ability.
Figure 5: Long trajectory visualization. Only trajectories consisting of more than 50 links are selected and visualized. The pre-trained model reproduces them less faithfully, while RMFT clearly outperforms the other two phases in long trajectory modeling (see, e.g., the western part of the map).
...and 4 more figures
