Table of Contents
Fetching ...

Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models

Tangwen Qian, Junhe Li, Yile Chen, Gao Cong, Tao Sun, Fei Wang, Yongjun Xu

TL;DR

MVTraj integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data and significantly outperforms existing baselines in tasks associated with various spatial views, validating its effectiveness and practical utility in spatio-temporal modeling.

Abstract

Modeling trajectory data with generic-purpose dense representations has become a prevalent paradigm for various downstream applications, such as trajectory classification, travel time estimation and similarity computation. However, existing methods typically rely on trajectories from a single spatial view, limiting their ability to capture the rich contextual information that is crucial for gaining deeper insights into movement patterns across different geospatial contexts. To this end, we propose MVTraj, a novel multi-view modeling method for trajectory representation learning. MVTraj integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data. To align the learning process across multiple views, we utilize GPS trajectories as a bridge and employ self-supervised pretext tasks to capture and distinguish movement patterns across different spatial views. Following this, we treat trajectories from different views as distinct modalities and apply a hierarchical cross-modal interaction module to fuse the representations, thereby enriching the knowledge derived from multiple sources. Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views, validating its effectiveness and practical utility in spatio-temporal modeling.

Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models

TL;DR

MVTraj integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data and significantly outperforms existing baselines in tasks associated with various spatial views, validating its effectiveness and practical utility in spatio-temporal modeling.

Abstract

Modeling trajectory data with generic-purpose dense representations has become a prevalent paradigm for various downstream applications, such as trajectory classification, travel time estimation and similarity computation. However, existing methods typically rely on trajectories from a single spatial view, limiting their ability to capture the rich contextual information that is crucial for gaining deeper insights into movement patterns across different geospatial contexts. To this end, we propose MVTraj, a novel multi-view modeling method for trajectory representation learning. MVTraj integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data. To align the learning process across multiple views, we utilize GPS trajectories as a bridge and employ self-supervised pretext tasks to capture and distinguish movement patterns across different spatial views. Following this, we treat trajectories from different views as distinct modalities and apply a hierarchical cross-modal interaction module to fuse the representations, thereby enriching the knowledge derived from multiple sources. Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views, validating its effectiveness and practical utility in spatio-temporal modeling.

Paper Structure

This paper contains 31 sections, 15 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Motivation of multi-view trajectory representation learning. Given that GPS trajectories provide the highest sampling frequency and spatial resolution, they serve as a bridge to exhibit trajectories with various contexts, which facilitates representation learning across multiple views.
  • Figure 2: An overview of MVTraj. MVTraj consists of two key modules: a multi-view trajectory integration module, and a hierarchical multi-modal interaction module.
  • Figure 3: Effect of Pre-training in Travel Time Estimation.

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4