MobilityGPT: Enhanced Human Mobility Modeling with a GPT model
Ammar Haydari, Dongjie Chen, Zhengfeng Lai, Michael Zhang, Chen-Nee Chuah
TL;DR
MobilityGPT reframes human mobility trajectory generation as autoregressive sequence synthesis using a GPT-like transformer. It introduces gravity-based sampling and a road connectivity matrix to enforce geospatial realism, and encodes trajectories as road-link tokens with an end-of-trajectory sentinel to manage sequence boundaries and reduce GPS noise. A novel Reinforcement Learning from Trajectory Feedback (RLTF) framework trains a lightweight reward model and applies PPO-based fine-tuning to align generated paths with real data in terms of origin-destination similarity, trip length, radius, and connectivity. Experiments on Porto and Beijing taxi data show MobilityGPT outperforms state-of-the-art baselines across multiple distributional metrics, demonstrating high-fidelity, semantically realistic mobility trajectories while respecting road-network constraints. The work also analyzes ablations, tokenizer choices, and computational aspects, and provides code and data references for reproducibility.
Abstract
Generative models have shown promising results in capturing human mobility characteristics and generating synthetic trajectories. However, it remains challenging to ensure that generated geospatial mobility data is semantically realistic, with consistent location sequences, and that it reflects real-world characteristics such as geospatial constraints. To address these issues, we reformulate human mobility modeling as an autoregressive generation task, leveraging the Generative Pre-trained Transformer (GPT) architecture. To make generation controllable, we propose a geospatially aware generative model, MobilityGPT. First, we propose a gravity-based sampling method to train a transformer for semantic sequence similarity. Second, we constrain the training process with a road connectivity matrix that provides the connectivity of sequences in trajectory generation, thereby keeping generated trajectories within geospatial limits. Lastly, we construct a preference dataset for fine-tuning MobilityGPT via a Reinforcement Learning from Trajectory Feedback (RLTF) mechanism, which minimizes the travel distance between training and synthetically generated trajectories. Experiments on real-world datasets demonstrate MobilityGPT's superior performance over state-of-the-art methods in generating high-quality mobility trajectories that are closest to real data in terms of origin-destination similarity, trip length, travel radius, link, and gravity distributions. We release the source code and reference links to datasets at https://github.com/ammarhydr/MobilityGPT.
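
To make the road connectivity constraint concrete, the following is a minimal sketch of how a connectivity matrix could restrict next-link choices during autoregressive generation. It is not the released MobilityGPT implementation: the abstract describes applying the matrix during training, whereas this sketch applies it at sampling time for simplicity. The names `EOT`, `ADJ`, `masked_next_link_probs`, `generate`, and the uniform `logits_fn` stand-in for the transformer are all assumptions made for illustration.

```python
import math
import random

EOT = 0  # hypothetical token id for the end-of-trajectory sentinel

# Toy connectivity matrix (assumed): ADJ[i][j] == 1 means road link j
# can directly follow road link i. Row and column 0 belong to EOT.
ADJ = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],  # link 1 -> link 2
    [0, 0, 0, 1, 1],  # link 2 -> link 3 or link 4
    [0, 0, 0, 0, 1],  # link 3 -> link 4
    [0, 0, 0, 0, 0],  # link 4 is a dead end
]


def masked_next_link_probs(logits, current_link, adjacency):
    """Softmax over next-token logits, restricted to connected links plus EOT."""
    allowed = adjacency[current_link]
    candidates = [j for j in range(len(logits)) if allowed[j] or j == EOT]
    peak = max(logits[j] for j in candidates)  # subtract for numerical stability
    weights = {j: math.exp(logits[j] - peak) for j in candidates}
    total = sum(weights.values())
    probs = [0.0] * len(logits)  # disconnected links keep probability zero
    for j, w in weights.items():
        probs[j] = w / total
    return probs


def generate(start_link, adjacency, logits_fn, max_len=20, rng=None):
    """Sample a trajectory of road-link tokens until EOT or max_len."""
    rng = rng or random.Random(0)
    trajectory = [start_link]
    while len(trajectory) < max_len:
        logits = logits_fn(trajectory)  # stand-in for a transformer forward pass
        probs = masked_next_link_probs(logits, trajectory[-1], adjacency)
        nxt = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if nxt == EOT:
            break
        trajectory.append(nxt)
    return trajectory


if __name__ == "__main__":
    # Uniform logits stand in for a trained model's output.
    print(generate(1, ADJ, lambda traj: [0.0] * len(ADJ)))
```

Because disconnected links receive zero probability, every consecutive pair in a sampled trajectory is a valid transition in the toy network, whatever logits the stand-in model produces.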
