RED: Effective Trajectory Representation Learning with Comprehensive Information

Silin Zhou, Shuo Shang, Lisi Chen, Christian S. Jensen, Panos Kalnis

TL;DR

RED tackles the core challenge of trajectory representation learning by leveraging comprehensive trajectory information through a Transformer-based masked autoencoder. It introduces road-aware masking, a spatial-temporal-user joint embedding, and dual-objective training (next-segment prediction and trajectory reconstruction) to produce high-quality trajectory vectors. The enhanced Transformer uses virtual tokens and time-distance aware attention to better capture spatial-temporal dependencies, yielding substantial improvements in travel-time estimation, trajectory classification, similarity computation, and retrieval across three real-world datasets. Overall, RED delivers state-of-the-art TRL performance with favorable efficiency, demonstrating the value of integrating road, user, spatial, temporal, and movement semantics in trajectory modeling.

Abstract

Trajectory representation learning (TRL) maps trajectories to vectors that can then be used for various downstream tasks, including trajectory similarity computation, trajectory classification, and travel-time estimation. However, existing TRL methods often produce vectors that, when used in downstream tasks, yield insufficiently accurate results. A key reason is that they fail to utilize the comprehensive information encompassed by trajectories. We propose a self-supervised TRL framework, called RED, which effectively exploits multiple types of trajectory information. Overall, RED adopts the Transformer as the backbone model and masks the constituting paths in trajectories to train a masked autoencoder (MAE). In particular, RED considers the moving patterns of trajectories by employing a Road-aware masking strategy that retains key paths of trajectories during masking, thereby preserving crucial information of the trajectories. RED also adopts a spatial-temporal-user joint Embedding scheme to encode comprehensive information when preparing the trajectories as model inputs. To conduct training, RED adopts Dual-objective task learning: the Transformer encoder predicts the next segment in a trajectory, and the Transformer decoder reconstructs the entire trajectory. RED also considers the spatial-temporal correlations of trajectories by modifying the attention mechanism of the Transformer. We compare RED with 9 state-of-the-art TRL methods for 4 downstream tasks on 3 real-world datasets, finding that RED can usually improve the accuracy of the best-performing baseline by over 5%.
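
The idea of masking a trajectory while retaining its key paths can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `road_aware_mask`, the `is_key` flags, and the `mask_ratio` value are all hypothetical, and the abstract does not say how key paths are identified, so the sketch simply takes that decision as an input.

```python
import random


def road_aware_mask(segments, is_key, mask_ratio=0.5, seed=None):
    """Split trajectory positions into kept and masked index lists.

    segments:   road-segment ids forming one trajectory
    is_key:     booleans, True where a segment is treated as a key path
                (hypothetical input; the selection rule is an assumption)
    mask_ratio: target fraction of the trajectory to mask (assumed value)
    """
    if len(segments) != len(is_key):
        raise ValueError("segments and is_key must have the same length")

    rng = random.Random(seed)

    # Only non-key positions are eligible for masking.
    candidates = [i for i, key in enumerate(is_key) if not key]

    # Never mask more positions than there are non-key segments.
    n_mask = min(len(candidates), int(mask_ratio * len(segments)))

    masked = sorted(rng.sample(candidates, n_mask))
    masked_set = set(masked)
    kept = [i for i in range(len(segments)) if i not in masked_set]
    return kept, masked


if __name__ == "__main__":
    segs = [101, 102, 103, 104, 105, 106]
    keys = [True, False, False, True, False, False]
    kept, masked = road_aware_mask(segs, keys, mask_ratio=0.5, seed=0)
    print("kept positions:  ", kept)
    print("masked positions:", masked)
```

In an MAE-style setup, the kept positions would be fed to the encoder and the masked positions would be the reconstruction targets for the decoder. The cap on `n_mask` means that a trajectory consisting mostly of key segments is masked less than `mask_ratio` suggests, which is a design choice of this sketch rather than something stated in the abstract.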

Paper Structure

This paper contains 22 sections, 7 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: Overall architecture of the RED framework.
  • Figure 2: Trajectory sample statistics of the Porto dataset.
  • Figure 3: Illustration of the road-aware masking.
  • Figure 4: The enhanced Transformer of the encoder and decoder.
  • Figure 5: Trajectory similarity computation time (in s) when varying the trajectory length and dataset size.
  • ...and 2 more figures