Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification
Yumeng Song, Yu Gu, Tianyi Li, Yushuai Li, Christian S. Jensen, Ge Yu
TL;DR
MLSimp tackles efficient, query-aware trajectory simplification by coupling a graph neural network-based point-importance predictor (GNN-TS) with a diffusion-based generator (Diff-TS) in a mutual-learning framework. It introduces two formal metrics for point importance: globality, which captures a point’s correlation with the entire trajectory, and uniqueness, which captures its differences from neighboring points. Together they enable non-iterative, globally informed point retention and workload-aligned adjustments. Through alternating training, the amplified signals that Diff-TS generates sharpen GNN-TS’s predictions, and GNN-TS’s predictions in turn inform Diff-TS, yielding simplified trajectories that preserve query accuracy while reducing processing time. Experiments on Geolife, T-Drive, and OSM show that MLSimp reduces simplification time by 42%–70% and improves query accuracy over simplified trajectories by up to 34.6%, with gains reported for range, kNN, similarity, and clustering queries.
Abstract
As large volumes of trajectory data accumulate, simplifying trajectories to reduce storage and querying costs is increasingly studied. Existing proposals face three main problems. First, they require numerous iterations to decide which GPS points to delete. Second, they focus only on the relationships between neighboring points (local information) while neglecting the overall structure (global information). This reduces the global similarity between the simplified and original trajectories and makes it difficult to maintain consistency in query results, especially for similarity-based queries. Third, they fail to differentiate the importance of points with similar features, so the points they retain are a suboptimal choice for preserving the original trajectory information. We propose MLSimp, a novel Mutual Learning query-driven trajectory simplification framework that integrates two distinct models: GNN-TS, based on graph neural networks, and Diff-TS, based on diffusion models. GNN-TS evaluates the importance of a point according to its globality, capturing its correlation with the entire trajectory, and its uniqueness, capturing its differences from neighboring points. It also incorporates attention mechanisms in the GNN layers, which integrate data from all points within the same trajectory simultaneously and refine their representations, thus avoiding iterative processes. Diff-TS generates amplified signals to enable the retention of the most important points at low compression rates. Experiments involving eight baselines on three databases show that MLSimp reduces the simplification time by 42%–70% and improves query accuracy over simplified trajectories by up to 34.6%.
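The idea of scoring every point once and then keeping the top-scoring ones, rather than deleting points iteratively, can be illustrated with a small sketch. The code below is not MLSimp: GNN-TS learns globality and uniqueness from point and trajectory representations, whereas this sketch substitutes simple geometric stand-ins that are assumptions made only for illustration. Here "globality" is approximated by a point's distance from the chord joining the trajectory's endpoints, and "uniqueness" by its distance from the segment joining its two neighbors. The function names, the `alpha` weight, and the `keep` parameter are all hypothetical.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter, clamped so the foot stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def normalize(values):
    """Scale non-negative values into [0, 1] by dividing by the maximum."""
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]


def importance_scores(traj, alpha=0.5):
    """Score each point from two hand-made proxies (assumptions, not the
    learned metrics of the paper):
      - globality proxy: distance from the start-to-end chord
      - uniqueness proxy: distance from the segment joining both neighbors
    """
    n = len(traj)
    globality = [point_to_segment_distance(p, traj[0], traj[-1]) for p in traj]
    uniqueness = [0.0] * n
    for i in range(1, n - 1):
        uniqueness[i] = point_to_segment_distance(traj[i], traj[i - 1], traj[i + 1])
    g = normalize(globality)
    u = normalize(uniqueness)
    return [alpha * gi + (1 - alpha) * ui for gi, ui in zip(g, u)]


def simplify(traj, keep, alpha=0.5):
    """Keep both endpoints plus the highest-scoring interior points.

    All scores are computed in one pass and followed by a single top-k
    selection, so no point is removed iteratively.
    """
    n = len(traj)
    if keep >= n:
        return list(traj)
    if keep < 2:
        raise ValueError("keep must be at least 2 to retain both endpoints")
    scores = importance_scores(traj, alpha)
    interior = sorted(range(1, n - 1), key=lambda i: scores[i], reverse=True)
    kept = sorted([0, n - 1] + interior[: keep - 2])
    return [traj[i] for i in kept]


if __name__ == "__main__":
    trajectory = [(0, 0), (1, 0), (2, 0), (3, 3), (4, 0), (5, 0), (6, 0)]
    print(simplify(trajectory, keep=3))
```

On this toy trajectory the single spike at `(3, 3)` scores highest on both proxies, so it is the one interior point retained when `keep=3`. In MLSimp the scores would instead come from the trained GNN-TS model, with Diff-TS supplying amplified signals during mutual learning.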
