Learning Demographic-Conditioned Mobility Trajectories with Aggregate Supervision

Jessie Z. Li, Zhiqing Hong, Toru Shirakawa, Serina Chang

TL;DR

Experiments show that ATLAS substantially improves demographic realism over baselines (Jensen--Shannon divergence, JSD, $\downarrow$ 12%--69%) and closes much of the gap to strongly supervised training.

Abstract

Human mobility trajectories are widely studied in public health and social science, where different demographic groups exhibit significantly different mobility patterns. However, existing trajectory generation models rarely capture this heterogeneity because most trajectory datasets lack demographic labels. To address this gap in data, we propose ATLAS, a weakly supervised approach for demographic-conditioned trajectory generation using only (i) individual trajectories without demographic labels, (ii) region-level aggregated mobility features, and (iii) region-level demographic compositions from census data. ATLAS trains a trajectory generator and fine-tunes it so that simulated mobility matches observed regional aggregates while conditioning on demographics. Experiments on real trajectory data with demographic labels show that ATLAS substantially improves demographic realism over baselines (JSD $\downarrow$ 12%--69%) and closes much of the gap to strongly supervised training. We further develop theoretical analyses for when and why ATLAS works, identifying key factors including demographic diversity across regions and the informativeness of the aggregate feature, paired with experiments demonstrating the practical implications of our theory. We release our code at https://github.com/schang-lab/ATLAS.
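The aggregate-matching idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes that a region's aggregate feature is the composition-weighted mixture of per-group feature distributions, and it uses total variation (TV) as the mismatch measure. The names `regional_aggregates` and `tv_distance` and all numbers are hypothetical.

```python
import numpy as np

def regional_aggregates(P, M):
    """Mix per-group feature distributions into regional aggregates.

    P: (G, K) array, row g is the demographic composition p(. | g).
    M: (K, D) array, row k is group k's feature distribution (assumed form).
    Returns a (G, D) array of region-level aggregates.
    """
    return P @ M

def tv_distance(a, b):
    """Total variation distance between rows of a and b."""
    return 0.5 * np.abs(a - b).sum(axis=-1)

# Toy data: 2 regions, 2 demographic groups, 3 feature bins (all made up).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
M_true = np.array([[0.6, 0.3, 0.1],
                   [0.1, 0.3, 0.6]])
M_model = np.array([[0.4, 0.3, 0.3],
                    [0.3, 0.3, 0.4]])

nu_star = regional_aggregates(P, M_true)    # observed regional aggregates
nu_model = regional_aggregates(P, M_model)  # model-implied aggregates

# Average mismatch over regions; a fine-tuning step would try to reduce this.
loss = tv_distance(nu_model, nu_star).mean()
print(loss)
```

In this toy setting the loss is zero only when the model-implied aggregates equal the observed ones in every region, which is the quantity that only region-level supervision can check.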

Paper Structure

This paper contains 71 sections, 13 theorems, 39 equations, 7 figures, 18 tables.

Key Result

Lemma 1

Suppose the full-rank condition holds. If the model-implied and true regional aggregates match, $\nu_\theta(g) = \nu_\star(g)$ for all regions $g$, then the demographic group-level feature means coincide.
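A minimal sketch of why a full-rank condition would make group-level means identifiable, under assumptions not stated on this page: the regional aggregate is taken to be $\nu(g) = \sum_k p(k \mid g)\,\mu_k$, and the condition is read as the $G \times K$ composition matrix having full column rank. The variable names (`P`, `mu_true`, `mu_hat`, `P_flat`, `mu_alt`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
G, K, D = 6, 3, 4  # regions, demographic groups, feature dimension (toy sizes)

# Assumed setup: each row of P is a region's demographic composition p(. | g).
P = rng.dirichlet(np.ones(K), size=G)  # (G, K)
mu_true = rng.normal(size=(K, D))      # group-level feature means
nu_star = P @ mu_true                  # regional aggregates (assumed mixture form)

# With full column rank, the linear system P @ mu = nu_star has a unique solution.
assert np.linalg.matrix_rank(P) == K
mu_hat, *_ = np.linalg.lstsq(P, nu_star, rcond=None)
print(np.allclose(mu_hat, mu_true))

# Counterexample: if every region has the same composition, P has rank 1 and
# different group means produce identical regional aggregates.
P_flat = np.tile(P[0], (G, 1))
n = np.array([P[0, 1], -P[0, 0], 0.0])  # vector orthogonal to the shared composition
mu_alt = mu_true + np.outer(n, np.ones(D))
print(np.allclose(P_flat @ mu_alt, P_flat @ mu_true))
```

The second case mirrors the role of demographic diversity across regions named in the abstract: without variation in composition, matching aggregates does not pin down the group-level means.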

Figures (7)

  • Figure 1: Overview of ATLAS. Phase 1: Train a generative model on trajectories without demographic labels. Phase 2: Fine-tune with demographic conditioning by sampling groups from the region's demographic composition $p(\cdot\mid g)$ and optimizing to match the region's observed aggregate features $\nu_\star(g)$.
  • Figure 2: Effect of demographic diversity on ATLAS performance (RQ1). JSD (lower is better) across four metrics on Virginia (top) and California (bottom). Bars indicate average JSD over 8 demographic groups; error bars indicate standard deviation over groups. ATLAS substantially improves over the baseline under well-conditioned regional partitions, often approaching strongly supervised performance, and degrades gracefully as partitions become ill-conditioned.
  • Figure B1: Pairwise JSD between demographic groups. Heatmaps showing pairwise Jensen--Shannon divergence (JSD) of POI visit distributions between $K{=}8$ age$\times$gender groups for Virginia (left) and California (right). JSD values range from approximately 0.40 to 0.54, with warmer colors indicating larger divergence.
  • Figure B2: Virginia: average JSD during finetuning. Validation loss (TV) and per-metric JSD (averaged across demographic groups) over training steps.
  • Figure B3: Virginia: per-group JSD during finetuning. Per-metric JSD broken down by demographic group over training steps.
  • ...and 2 more figures
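The JSD values reported in the figure captions compare discrete distributions, such as POI visit distributions for pairs of demographic groups. A minimal sketch of that computation follows. The base-2 logarithm (which bounds JSD by 1) is an assumption, since this page does not state the base, and the helper names `jsd` and `pairwise_jsd` and the toy counts are hypothetical.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions (log base 2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize counts to probabilities
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute 0 by convention
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def pairwise_jsd(dists):
    """Symmetric matrix of JSD values between every pair of rows in dists."""
    k = len(dists)
    out = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            out[i, j] = out[j, i] = jsd(dists[i], dists[j])
    return out

# Toy visit counts for three hypothetical groups over four POI categories.
visits = np.array([[30, 10, 5, 5],
                   [5, 5, 10, 30],
                   [12, 13, 12, 13]])
heatmap = pairwise_jsd(visits)
print(heatmap.round(3))
```

A matrix like `heatmap` is the kind of quantity a pairwise heatmap such as Figure B1 would display, with zeros on the diagonal and larger values for groups whose visit patterns differ more.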

Theorems & Definitions (28)

  • Definition 1: Trajectory Space
  • Definition 2: Demographic Groups and Conditional Distributions
  • Definition 3: Regional Partition and Demographic Composition
  • Definition 4: Aggregate Feature Map
  • Lemma 1: Uniqueness of group-level feature means
  • Proof sketch of Lemma 1
  • Lemma 2: Stability under aggregate perturbations
  • Proof sketch of Lemma 2
  • Lemma 3: Finite Sample Error Bound
  • Theorem 1: Overall bound (optimization + sampling)
  • ...and 18 more