Trajectory Similarity Measurement: An Efficiency Perspective

Yanchuan Chang, Egemen Tanin, Gao Cong, Christian S. Jensen, Jianzhong Qi

TL;DR

The paper examines the efficiency of trajectory similarity computation, comparing traditional non-learned (heuristic) measures with embedding-based learned measures on CPUs and GPUs across multiple tasks. It shows that learned measures do not universally outperform non-learned ones: for online, one-off similarity computation they are faster only when embeddings are precomputed, whereas they can excel in offline clustering and kNN queries backed by vector indices. Among learned methods, self-attention-based models train rapidly and typically yield higher accuracy, though trade-offs in memory use and embedding cost persist. The findings help practitioners choose an approach for online, one-off computations versus offline, bulk analyses, and point to future work on accuracy guarantees and specialized indices.

Abstract

Trajectories that capture object movement have numerous applications, in which similarity computation between trajectories often plays a key role. Traditionally, the similarity between two trajectories is quantified by means of heuristic measures, e.g., Hausdorff or ERP, that operate directly on the trajectories. In contrast, recent studies exploit deep learning to map trajectories to d-dimensional vectors, called embeddings. Then, some distance measure, e.g., Manhattan or Euclidean, is applied to the embeddings to quantify trajectory similarity. The resulting similarities are inaccurate: they only approximate the similarities obtained using the heuristic measures. As distance computation on embeddings is efficient, focus has been on achieving embeddings yielding high accuracy. Adopting an efficiency perspective, we analyze the time complexities of both the heuristic and the learning-based approaches, finding that the time complexities of the former approaches are not necessarily higher. Through extensive experiments on open datasets, we find that, on both CPUs and GPUs, only a few learning-based approaches can deliver the promised higher efficiency, when the embeddings can be pre-computed, while heuristic approaches are more efficient for one-off computations. Among the learning-based approaches, the self-attention-based ones are the fastest to learn embeddings that also yield the highest accuracy for similarity queries. These results have implications for the use of trajectory similarity approaches given different application requirements.
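
To make the contrast in the abstract concrete, the following is a minimal sketch (not the authors' code) of the two kinds of computation. The function names, the toy trajectories, and the placeholder embedding vectors are assumptions made for illustration; in particular, the embeddings are hand-written numbers, not the output of any trained model. The Hausdorff function compares every point of one trajectory with every point of the other, so its cost grows with the product of the trajectory lengths, while the embedding distances cost time linear in the embedding dimension d once the embeddings exist.

```python
import math

def hausdorff(t1, t2):
    """Symmetric Hausdorff distance between two trajectories.

    Each trajectory is a list of (x, y) points. The nested loops perform
    len(t1) * len(t2) point-to-point distance computations per direction.
    """
    def directed(a, b):
        # For every point in a, find its nearest point in b; keep the worst case.
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(t1, t2), directed(t2, t1))

def manhattan(u, v):
    """L1 distance between two d-dimensional embeddings; O(d)."""
    return sum(abs(x - y) for x, y in zip(u, v))

def euclidean(u, v):
    """L2 distance between two d-dimensional embeddings; O(d)."""
    return math.dist(u, v)

# Toy trajectories (illustrative values only).
t1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
t2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]

# Placeholder embeddings (assumed values, not produced by a real model).
e1 = [1.0, 2.0, 3.0]
e2 = [4.0, 6.0, 3.0]

print(hausdorff(t1, t2))   # heuristic measure on the raw points
print(manhattan(e1, e2))   # learned-measure style distance on embeddings
print(euclidean(e1, e2))
```

The sketch leaves out the step that dominates the cost of learned measures in the one-off setting: running a model to produce `e1` and `e2` in the first place. That is the cost the abstract refers to when it says learned approaches only pay off when embeddings can be pre-computed.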

Paper Structure

This paper contains 38 sections, 17 figures, 14 tables, 1 algorithm.

Figures (17)

  • Figure 1: Computation of non-learned measures (dotted lines indicate point-to-point distance computation)
  • Figure 2: Computation of learned measures
  • Figure 3: Representative trajectory similarity measures plotted in chronological order
  • Figure 4: Computation of linear scan-based measures
  • Figure 5: RNN-based learned measures
  • ...and 12 more figures