ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation

Tianci Bu, Le Zhou, Wenchuan Yang, Jianhong Mou, Kang Yang, Suoyi Tan, Feng Yao, Jingyuan Wang, Xin Lu

TL;DR

ProDiff addresses the challenge of imputing missing trajectory data under minimal information by jointly learning a diffusion-based generator and a prototype-conditioned guidance mechanism. Through a Prototype Condition Extractor, the model embeds macro movement patterns into a latent space and aligns these patterns with endpoint information during diffusion denoising. Empirical evaluations on WuXi and FourSquare demonstrate state-of-the-art imputation accuracy and a strong correlation between generated and real trajectories (approximately $0.93$), validating the approach’s effectiveness for urban mobility analysis. The framework offers a scalable, privacy-conscious solution that leverages large-scale unlabeled trajectory patterns to improve reconstruction under sparse observation, with potential extensions to personalized and uncertainty-aware trajectory generation.

Abstract

Trajectory data is crucial for various applications but often suffers from incompleteness due to device limitations and diverse collection scenarios. Existing imputation methods rely on sparse trajectory or travel information, such as velocity, to infer missing points. However, these approaches assume that sparse trajectories retain essential behavioral patterns, which places significant demands on data acquisition and overlooks the potential of large-scale human trajectory embeddings. To address this, we propose ProDiff, a trajectory imputation framework that uses only two endpoints as minimal information. It integrates prototype learning to embed human movement patterns and a denoising diffusion probabilistic model for robust spatiotemporal reconstruction. Joint training with a tailored loss function ensures effective imputation. ProDiff outperforms state-of-the-art methods, improving accuracy by 6.28% on FourSquare and 2.52% on WuXi. Further analysis shows a 0.927 correlation between generated and real trajectories, demonstrating the effectiveness of our approach.

Paper Structure

This paper contains 29 sections, 1 theorem, 23 equations, 6 figures, 9 tables, 2 algorithms.

Key Result

Theorem 3.4

Any global optimum $(f^*, \{p_k^*\})$ satisfies:

Figures (6)

  • Figure 1: Comparison of traditional and proposed trajectory imputation. Traditional methods preserve movement patterns but impose device constraints and rely on predefined graphs. Our approach directly embeds trajectories into vector space for minimal information imputation.
  • Figure 2: Left illustrates how prototype learning and diffusion models interact. The diffusion process progressively corrupts trajectories with Gaussian noise, preserving only the endpoints, while prototype learning embeds trajectories and extracts patterns. During denoising, prototype-based conditions, combined with endpoint features, guide the diffusion model. A joint loss function optimizes both components, ensuring effective trajectory reconstruction. Right is the architecture of the diffusion base model.
  • Figure 3: Composition of the prototype condition extractor and its workflow during training and testing (black and blue lines).
  • Figure 4: (a) Radar charts showing the normalized performance of different models across six distinct metrics. (b) Histograms comparing the performance of each model across different metrics, with dashed lines indicating the best-performing model's value for each metric.
  • Figure 5: Trajectory data representation after dimensionality reduction by PaCMAP. The trajectories of randomly selected samples and their neighboring samples are plotted to interpret the human trajectory patterns captured by prototype learning.
  • ...and 1 more figure
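
The Figure 2 caption describes a diffusion process that progressively corrupts a trajectory with Gaussian noise while preserving only the endpoints. A minimal sketch of that idea is given here, using the standard DDPM closed-form forward step and then restoring the two endpoints. The linear noise schedule, the 500-step horizon, the toy trajectory, and the function names are all assumptions made for illustration; the paper may treat the endpoints differently (for example, purely as conditioning features), so this is not the authors' implementation.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Hypothetical linear noise schedule; the paper's schedule may differ.
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse_keep_endpoints(traj, t, alpha_bar, rng):
    """Noise a trajectory at diffusion step t, leaving its endpoints intact.

    traj:      (L, 2) array of points, e.g. (longitude, latitude)
    t:         integer step index into alpha_bar
    alpha_bar: cumulative products of (1 - beta)
    """
    noise = rng.standard_normal(traj.shape)
    # Standard DDPM closed form: x_t = sqrt(a_bar_t) x_0 + sqrt(1 - a_bar_t) eps
    noisy = np.sqrt(alpha_bar[t]) * traj + np.sqrt(1.0 - alpha_bar[t]) * noise
    # Assumption: the two endpoints are the only observed information,
    # so they are restored after noising.
    noisy[0] = traj[0]
    noisy[-1] = traj[-1]
    return noisy

betas = linear_beta_schedule(500)
alpha_bar = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)

# Toy 8-point trajectory in normalized coordinates.
xs = np.linspace(0.0, 1.0, 8)
traj = np.stack([xs, xs ** 2], axis=1)

noisy = forward_diffuse_keep_endpoints(traj, t=499, alpha_bar=alpha_bar, rng=rng)
```

At the final step the interior points are close to pure Gaussian noise while the first and last points still match the input, which is the "minimal information" a denoiser would start from in this sketch.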

Theorems & Definitions (5)

  • Definition 3.1
  • Definition 3.2
  • Definition 3.3
  • Theorem 3.4: The Optimality of Prototype Learning
  • proof