MoveGPT: Scaling Mobility Foundation Models with Spatially-Aware Mixture of Experts

Chonghua Han, Yuan Yuan, Jingtao Ding, Jie Feng, Fanjin Meng, Yong Li

TL;DR

MoveGPT tackles the problem of scaling mobility models by introducing a universal location encoder and a Spatially-Aware Mixture-of-Experts Transformer, which together enable cross-city pretraining on 1.5B samples from 16 cities. It reframes next-step mobility prediction as a scalable retrieval task over a DCN-generated candidate database, and uses STAR gating to specialize experts along spatial and temporal dimensions. The results show state-of-the-art performance on next-location prediction, long-term prediction, and generation tasks, with strong generalization to unseen cities and clear scaling laws that mirror those of NLP foundation models. This work paves the way for large-scale, transferable mobility intelligence with potential impact on urban planning and transportation systems.
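To make the retrieval framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate location already has an embedding and that the model emits a query vector for the next step; prediction then reduces to a nearest-neighbor search. The function name `retrieve_next_locations`, the random embeddings, and the use of cosine similarity are all illustrative assumptions, and the DCN-based encoder from the paper is not reproduced here.

```python
import numpy as np

def retrieve_next_locations(query, candidates, k=3):
    """Return the indices and scores of the k candidates closest to the query.

    Hypothetical helper: cosine similarity is an assumed scoring function,
    chosen only for illustration.
    """
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = c @ q                    # cosine similarity to every candidate
    top = np.argsort(-scores)[:k]     # highest-scoring candidates first
    return top, scores[top]

# Stand-in data: random vectors play the role of location embeddings.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 16))

# Pretend the model's output lands very close to candidate 42.
query = candidates[42] + 0.01 * rng.normal(size=16)

top_idx, top_scores = retrieve_next_locations(query, candidates, k=3)
```

Because the output is scored against a candidate database rather than a fixed softmax vocabulary, locations from a new city can in principle be added by embedding them and appending rows to `candidates`, without changing the model's output layer.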

Abstract

The success of foundation models in language has inspired a new wave of general-purpose models for human mobility. However, existing approaches struggle to scale effectively due to two fundamental limitations: a failure to use meaningful basic units to represent movement, and an inability to capture the vast diversity of patterns found in large-scale data. In this work, we develop MoveGPT, a large-scale foundation model specifically architected to overcome these barriers. MoveGPT is built upon two key innovations: (1) a unified location encoder that maps geographically disjoint locations into a shared semantic space, enabling pre-training on a global scale; and (2) a Spatially-Aware Mixture-of-Experts Transformer that develops specialized experts to efficiently capture diverse mobility patterns. Pre-trained on billion-scale datasets, MoveGPT establishes a new state-of-the-art across a wide range of downstream tasks, achieving performance gains of up to 35% on average. It also demonstrates strong generalization capabilities to unseen cities. Crucially, our work provides empirical evidence of scaling ability in human mobility, validating a clear path toward building increasingly capable foundation models in this domain.

Paper Structure

This paper contains 32 sections, 11 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Performance of MoveGPT and the best baseline.
  • Figure 2: Illustration of the whole framework of MoveGPT, including three key components: (a) Autoregressive pretraining for mobility modeling; (b) Unified location encoder; (c) Candidate embedding for similarity search.
  • Figure 3: SAMoE Transformer block and Spatial-Temporal-Adapted Router (STAR).
  • Figure 4: Transfer performance on new cities.
  • Figure 5: The distribution of the spatial router's weights across different layers in MoveGPT shows both cross-city similarities (e.g., the first layer highlighted in red) and city-specific characteristics (e.g., the fifth layer highlighted in purple).
  • ...and 4 more figures