UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction
Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
TL;DR
UniST tackles the lack of universality in urban spatio-temporal prediction by proposing a two-stage framework: large-scale pre-training over diverse spatio-temporal data, followed by knowledge-guided prompt learning to adapt to varied patterns across scenarios. It employs a Transformer-based encoder–decoder with spatio-temporal patching, four self-supervised masking strategies, and a memory-augmented prompt learner that leverages domain knowledge (spatial closeness and hierarchy, temporal closeness and periodicity) to generate dynamic prompts for cross-dataset generalization. Empirical results across more than 20 datasets show UniST achieving state-of-the-art performance in short- and long-term prediction, with particularly strong few-shot and zero-shot capabilities, highlighting robust cross-domain transfer. The work demonstrates the practical potential of universal spatio-temporal models and suggests future integration of heterogeneous data formats (grid, sequence, graph) to further enhance universality and resilience in urban forecasting applications.
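To make the spatio-temporal patching and masking ideas above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`patch_grid`, `random_mask`, `temporal_mask`), the patch sizes, and the choice of these two masking schemes are assumptions made for illustration, and the summary above does not enumerate the four strategies UniST actually uses.

```python
import random
from typing import List, Set, Tuple

# Hypothetical type: (time index, row index, column index) of one patch.
Patch = Tuple[int, int, int]


def patch_grid(T: int, H: int, W: int, pt: int, ph: int, pw: int) -> List[Patch]:
    """Enumerate patch coordinates for a T x H x W grid cut into pt x ph x pw patches."""
    if T % pt or H % ph or W % pw:
        raise ValueError("grid dimensions must be divisible by the patch size")
    return [
        (t, h, w)
        for t in range(T // pt)
        for h in range(H // ph)
        for w in range(W // pw)
    ]


def random_mask(patches: List[Patch], ratio: float, seed: int = 0) -> Set[Patch]:
    """Hide a random fraction of patches (assumed strategy, for illustration)."""
    rng = random.Random(seed)
    return set(rng.sample(patches, int(len(patches) * ratio)))


def temporal_mask(patches: List[Patch], n_future: int) -> Set[Patch]:
    """Hide every patch in the last n_future time slots (forecasting-style mask)."""
    t_max = max(t for t, _, _ in patches)
    return {p for p in patches if p[0] > t_max - n_future}


# Example: 12 time steps on an 8 x 8 grid, with 2 x 4 x 4 patches.
patches = patch_grid(12, 8, 8, pt=2, ph=4, pw=4)
masked_random = random_mask(patches, ratio=0.5)
masked_future = temporal_mask(patches, n_future=2)
visible = [p for p in patches if p not in masked_future]
```

In a masked pre-training setup of this kind, the encoder would see only the `visible` patches and the decoder would be trained to reconstruct the masked ones; a temporal mask makes that reconstruction task resemble forecasting.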
Abstract
Urban spatio-temporal prediction is crucial for informed decision-making, such as traffic management, resource optimization, and emergency response. Despite remarkable breakthroughs in pretrained natural language models that enable one model to handle diverse tasks, a universal solution for spatio-temporal prediction remains challenging. Existing prediction approaches are typically tailored for specific spatio-temporal scenarios, requiring task-specific model designs and extensive domain-specific training data. In this study, we introduce UniST, a universal model designed for general urban spatio-temporal prediction across a wide range of scenarios. Inspired by large language models, UniST achieves success through: (i) utilizing diverse spatio-temporal data from different scenarios, (ii) effective pre-training to capture complex spatio-temporal dynamics, and (iii) knowledge-guided prompts to enhance generalization capabilities. These designs together unlock the potential of building a universal model for various scenarios. Extensive experiments on more than 20 spatio-temporal scenarios demonstrate UniST's efficacy in advancing state-of-the-art performance, especially in few-shot and zero-shot prediction. The datasets and code implementation are released at https://github.com/tsinghua-fib-lab/UniST.
