Generative Models for Synthetic Urban Mobility Data: A Systematic Literature Review
Alexandra Kapp, Julia Hansmeyer, Helena Mihaljević
TL;DR
This survey addresses the privacy-preserving generation of synthetic urban mobility data, a crucial need given the sensitivity of raw trajectories. It categorizes generative approaches into three paradigms (trips, user movements, and city populations) and contrasts traditional methods based on differential privacy (DP) and Markov models with deep learning techniques and hybrids of the two. The study reveals substantial heterogeneity in data sources, evaluation metrics, and privacy guarantees, with many works lacking rigorous privacy assessments or standard benchmarks. It highlights the need for standardized benchmarking, transparent reporting of data properties, and open sharing of code and data to enable reliable, practice-oriented deployment of synthetic mobility solutions.
Abstract
Although highly valuable for a variety of applications, urban mobility data is rarely made openly available because it contains sensitive personal information. Synthetic data generation aims to solve this issue by producing artificial data that resembles an original dataset in its structural and statistical characteristics but omits sensitive information. For mobility data, a large number of such models have been proposed over the last decade. This systematic review provides a structured, comparative overview of the current state of this heterogeneous and active field of research. Special focus is placed on the applicability of the reviewed models in practice.
