Traj-Transformer: Diffusion Models with Transformer for GPS Trajectory Generation
Zhiyang Zhang, Ningcong Chen, Xin Zhang, Yanhua Li, Shen Su, Hui Lu, Jun Luo
TL;DR
The paper tackles GPS trajectory generation with diffusion models, addressing the loss of street-level detail seen in convolution-based approaches. It introduces Traj-Transformer, a Transformer-based diffusion model that supports two GPS point embeddings, location embedding (loc-emb) and longitude-latitude embedding (lon-lat-emb), and uses adaptive layer normalization (adaLN) to condition noise prediction on timesteps and auxiliary information. Experiments on the Chengdu and Xi’an datasets show that lon-lat-emb combined with larger Transformer capacity yields higher trajectory fidelity and smaller deviation than UNet-based baselines, without requiring road-network data. The work demonstrates that Transformer architectures can provide high-quality, road-network-unaware trajectory generation and suggests avenues for fully Transformer-based end-to-end pipelines, including integration with RoadMAE for conditioning.
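To make the adaLN conditioning mentioned above more concrete, here is a minimal sketch of the general idea in plain Python: a token vector is layer-normalized, then scaled and shifted by values regressed from a conditioning vector (for example, a timestep embedding combined with auxiliary information). The function names, the `(1 + scale)` parameterization, and the single-token setting are assumptions made for illustration; this summary does not specify the paper's exact formulation.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(weights, bias, c):
    """Dense layer: one output per weight row, computed as row . c + bias."""
    return [sum(w * ci for w, ci in zip(row, c)) + b
            for row, b in zip(weights, bias)]


def ada_layer_norm(x, cond, w_scale, b_scale, w_shift, b_shift):
    """Adaptive layer norm (illustrative): the scale and shift applied after
    normalization are predicted from the conditioning vector `cond` rather
    than stored as fixed parameters."""
    scale = linear(w_scale, b_scale, cond)   # per-feature scale from cond
    shift = linear(w_shift, b_shift, cond)   # per-feature shift from cond
    normed = layer_norm(x)
    return [n * (1.0 + s) + t for n, s, t in zip(normed, scale, shift)]


# Hypothetical toy inputs: a 4-dim token and a 2-dim conditioning vector.
token = [1.0, 2.0, 3.0, 4.0]
cond = [0.5, -0.5]
zero_w = [[0.0, 0.0] for _ in token]
zero_b = [0.0] * len(token)

# With all-zero projections the conditioning has no effect, so the result
# reduces to ordinary layer normalization.
out = ada_layer_norm(token, cond, zero_w, zero_b, zero_w, zero_b)
```

With non-zero projection weights, different timesteps or auxiliary inputs produce different scales and shifts, which is how the conditioning signal can steer the noise prediction at every layer.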
Abstract
The widespread use of GPS devices has driven advances in spatiotemporal data mining, enabling machine learning models to simulate human decision-making and generate realistic trajectories, addressing both data collection costs and privacy concerns. Recent studies have shown the promise of diffusion models for high-quality trajectory generation. However, most existing methods rely on convolution-based architectures (e.g., UNet) to predict noise during the diffusion process, which often results in notable deviations and the loss of fine-grained street-level details due to limited model capacity. In this paper, we propose Trajectory Transformer, a novel model that employs a Transformer backbone for both conditional information embedding and noise prediction. We explore two GPS coordinate embedding strategies, location embedding and longitude-latitude embedding, and analyze model performance at different scales. Experiments on two real-world datasets demonstrate that Trajectory Transformer significantly enhances generation quality and effectively alleviates the deviation issues observed in prior approaches.
