Efficient Greedy Discrete Subtrajectory Clustering

Ivor van der Hoog; Lara Ost; Eva Rotenberg; Daniel Rutschmann

TL;DR

The work tackles subtrajectory clustering by forming Δ-clusters: groups of subtrajectories that all lie within discrete Fréchet distance Δ of a centre subtrajectory. It delivers optimized implementations of the SC subroutines and a greedy clustering framework built on them, together with PSC, an algorithm that computes a 2-approximation of the Pareto front over cluster cardinality and centre complexity in $O(n^2 \log^4 n)$ time. Empirical results show substantial improvements in runtime and memory relative to prior single-core methods on real and synthetic data, while achieving competitive clustering quality. The methods enable scalable map-construction and movement-pattern discovery across large trajectory collections.

Abstract

We cluster a set of trajectories T using subtrajectories of T. Clustering quality may be measured by the number of clusters, the number of vertices of T that are absent from the clustering, and by the Fréchet distance between subtrajectories in a cluster. A $Δ$-cluster of T is a cluster ${\mathcal{P}}$ of subtrajectories of T with a centre $P \in {\mathcal{P}}$ with complexity $\ell$, where all subtrajectories in ${\mathcal{P}}$ have Fréchet distance at most $Δ$ to $P$. Buchin, Buchin, Gudmundsson, Löffler and Luo present two $O(n^2 + n m \ell)$-time algorithms: SC($\max$, $\ell$, $Δ$, T) computes a single $Δ$-cluster where $P$ has at least $\ell$ vertices and maximises the cardinality $m$ of ${\mathcal{P}}$. SC($m$, $\max$, $Δ$, T) computes a single $Δ$-cluster where ${\mathcal{P}}$ has cardinality $m$ and maximises the complexity $\ell$ of $P$. We use such maximum-cardinality clusters in a greedy clustering algorithm. We provide an efficient implementation of SC($\max$, $\ell$, $Δ$, T) and SC($m$, $\max$, $Δ$, T) that significantly outperforms previous implementations. We use these functions as subroutines in a greedy clustering algorithm, which performs well when compared to existing subtrajectory clustering algorithms on real-world data. Finally, we observe that, for fixed $Δ$ and T, these two functions always output a point on the Pareto front of some bivariate function $θ(\ell, m)$. We design a new algorithm PSC($Δ$, T) that in $O( n^2 \log^4 n)$ time computes a $2$-approximation of this Pareto front. This yields a broader set of candidate clusters, with comparable quality. We show that using PSC($Δ$, T) as a subroutine improves the clustering quality and performance even further.
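To make the definition of a $Δ$-cluster concrete, the following is a minimal sketch, not the authors' implementation. It assumes the discrete Fréchet distance (as in the title), a single trajectory stored as a list of 2D points, and subtrajectories given as inclusive index ranges. The names `discrete_frechet` and `is_delta_cluster` are hypothetical and introduced here for illustration only.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two non-empty point sequences.

    Standard O(|p| * |q|) dynamic program that keeps only two rows.
    """
    prev = []
    for i, a in enumerate(p):
        cur = []
        for j, b in enumerate(q):
            d = dist(a, b)
            if i == 0 and j == 0:
                reach = d
            elif i == 0:
                reach = max(cur[j - 1], d)
            elif j == 0:
                reach = max(prev[0], d)
            else:
                # Cheapest way to arrive at (i, j), then pay for this pair.
                reach = max(min(prev[j], prev[j - 1], cur[j - 1]), d)
            cur.append(reach)
        prev = cur
    return prev[-1]


def is_delta_cluster(trajectory, centre, members, delta):
    """Check that every member lies within distance delta of the centre.

    `centre` and each entry of `members` are inclusive (start, end) index
    ranges into `trajectory` (an assumption of this sketch).
    """
    def sub(rng):
        start, end = rng
        return trajectory[start:end + 1]

    centre_pts = sub(centre)
    return all(discrete_frechet(centre_pts, sub(m)) <= delta for m in members)


# Two parallel passes over the same stretch, one unit apart.
trajectory = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
centre = (0, 2)              # complexity l = 3 vertices
members = [(0, 2), (3, 5)]   # cardinality m = 2

print(is_delta_cluster(trajectory, centre, members, delta=1.0))
print(is_delta_cluster(trajectory, centre, members, delta=0.5))
```

In this toy input the centre has complexity $\ell = 3$ and the cluster has cardinality $m = 2$. The check is brute force and is only meant to illustrate the definition; it says nothing about how SC or PSC achieve their stated running times.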
