Pre-trained Transformer Uncovers Meaningful Patterns in Human Mobility Data

Alameen Najjar

TL;DR

This work demonstrates that a transformer pre-trained on millions of unlabeled human mobility trajectories can learn embeddings that, after adaptation, encode high-level geospatial concepts such as population, land cover, and administrative divisions. By applying Masked Trajectory Modeling on a country-scale dataset and evaluating via fine-tuning, few-shot, and zero-shot adaptation, the study provides evidence that pre-training yields meaningful spatial representations that improve diverse downstream tasks. The findings show notable improvements across region- and trajectory-level analyses, including land-cover regression ($R^2$ gains of $0.26$–$0.38$) and trajectory diversity and length metrics ($R^2$ up to ~0.30 and ~0.70, respectively), highlighting the practical potential for GeoAI applications in urban planning and policy. The work also discusses privacy implications and outlines future directions such as incorporating temporal dynamics and differential privacy to enhance both performance and ethical safeguards.

Abstract

We empirically demonstrate that a transformer pre-trained on country-scale unlabeled human mobility data learns embeddings capable, through fine-tuning, of developing a deep understanding of the target geography and its corresponding mobility patterns. Utilizing an adaptation framework, we evaluate the performance of our pre-trained embeddings in encapsulating a broad spectrum of concepts directly and indirectly related to human mobility. This includes basic notions, such as geographic location and distance, and extends to more complex constructs, such as administrative divisions and land cover. Our extensive empirical analysis reveals a substantial performance boost gained from pre-training, reaching up to 38% in tasks such as tree-cover regression. We attribute this result to the ability of the pre-training to uncover meaningful patterns hidden in the raw data, beneficial for modeling relevant high-level concepts. The pre-trained embeddings emerge as robust representations of regions and trajectories, potentially valuable for a wide range of downstream applications.
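
The Masked Trajectory Modeling objective named in the TL;DR can be illustrated with a minimal sketch of the masking step alone. This is not the paper's implementation: it assumes a trajectory is a sequence of grid-cell identifiers, and the `[MASK]` token, the 15% masking rate, the function name `mask_trajectory`, and the cell IDs below are all hypothetical choices borrowed from common masked-modeling practice.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_trajectory(cells, mask_prob=0.15, rng=None):
    """Hide a random subset of a trajectory's cells.

    Returns (masked, targets): `masked` is the input sequence with some
    cells replaced by MASK_TOKEN; `targets` holds the original cell at
    each masked position and None elsewhere. A model would be trained to
    predict the non-None targets from the masked sequence.
    The 15% default rate is an assumption, not a value from the paper.
    """
    rng = rng or random.Random()
    masked, targets = [], []
    for cell in cells:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets.append(cell)
        else:
            masked.append(cell)
            targets.append(None)
    return masked, targets


# Hypothetical cell identifiers standing in for real grid-cell IDs.
trajectory = ["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"]
masked, targets = mask_trajectory(trajectory, mask_prob=0.4, rng=random.Random(0))
print(masked)
print(targets)
```

Because the labels come from the trajectory itself, no annotation is needed, which is what makes pre-training on unlabeled mobility data possible in the first place.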

Paper Structure (26 sections, 11 figures, 14 tables)

Figures (11)

  • Figure 1: A transformer pre-trained from scratch on country-scale unlabeled human mobility data is adapted to model a variety of high-level concepts manifesting at different levels of spatial analysis.
  • Figure 2: GPS data points aggregated using an Uber H3 grid of resolution 8. Missing polygons indicate data gaps.
  • Figure 3: Two example hexagons and their central coordinates marked in red.
  • Figure 4: Population (deciles) map of Kyoto prefecture: Ground-truth and inferred.
  • Figure 5: Municipality classification map of Aichi prefecture: Ground-truth and inferred.
  • ...and 6 more figures