Performance Evaluation of GPS Trajectory Rasterization Methods
Necip Enes Gengec, Ergin Tari
TL;DR
This work addresses efficient rasterization of GPS trajectory data into structured grids for downstream analysis, comparing QGIS, PostGIS+QGIS, and Python-based approaches (including a parallelized variant). The authors implement and evaluate four rasterization flows that compute three attributes (pixel-wise count, average speed, and maximum speed) at a 5 m pixel resolution using the Montreal MTL-Trajet dataset, across varying area sizes and point counts. The study finds that the Python (Parallel) method delivers the best overall performance, while QGIS performs poorly and PostGIS+QGIS excels at spatial joins but suffers from grid creation and data-import overheads, which degrade its total time as problem size grows. The results provide practical guidance for selecting rasterization strategies for large GPS datasets and suggest that the Python-based parallel approach scales well to large-area analyses and can be integrated with existing GIS workflows. $t_{Total}$ and $t_{Spatial Join}$ serve as the primary metrics, highlighting the trade-offs between in-memory grid creation costs and database-assisted operations.
Abstract
The availability of Global Positioning System (GPS) trajectory data is increasing along with the spread of GPS receivers and the growing use of mobility services. GPS trajectories are an important data source for traffic density detection, transport mode detection, and mapping data inference, typically through image processing and machine learning methods. As data size increases, representing this type of data efficiently for use in these methods becomes more difficult. A common approach is to represent GPS trajectory information such as average speed and bearing in raster image form and then apply analysis methods to the raster. In this study, we evaluate GPS trajectory data rasterization using the spatial join functions of QGIS and PostGIS+QGIS, and our iterative spatial structured grid aggregation implementation written in Python. Our implementation is also parallelizable, and the parallelized variant is included as the fourth method. In experiments carried out with an example GPS trajectory dataset, the QGIS and PostGIS+QGIS methods showed relatively low performance compared with our methods in terms of total processing time. The PostGIS+QGIS method achieved the best results for the spatial join, though its total performance degraded quickly as the test area size increased. In contrast, the performance of both of our methods decreases in direct proportion to the number of GPS points, and it can be improved in proportion to the number of processor cores and/or by using multiple computing clusters.
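To make the idea of structured grid aggregation concrete, the following is a minimal sketch, not the authors' implementation. It assumes points arrive as `(x, y, speed)` tuples in a projected coordinate system with metre units. The function names `rasterize`, `merge`, and `finalize` are hypothetical and chosen for illustration only. Each point is assigned to a 5 m pixel by integer division of its offset from the grid origin, and per-pixel count, speed sum, and maximum speed are accumulated in a single pass.

```python
import math

PIXEL_SIZE_M = 5.0  # pixel resolution reported in the study


def rasterize(points, x_min, y_min, pixel_size=PIXEL_SIZE_M):
    """Accumulate per-pixel [count, speed_sum, max_speed] in one pass.

    Assumption: coordinates are in a projected CRS with metre units,
    and (x_min, y_min) is the lower-left corner of the grid.
    """
    cells = {}
    for x, y, speed in points:
        col = math.floor((x - x_min) / pixel_size)
        row = math.floor((y - y_min) / pixel_size)
        cell = cells.get((row, col))
        if cell is None:
            cells[(row, col)] = [1, speed, speed]
        else:
            cell[0] += 1
            cell[1] += speed
            cell[2] = max(cell[2], speed)
    return cells


def merge(a, b):
    """Combine two partial results, e.g. from separate chunks or workers."""
    out = {key: list(acc) for key, acc in a.items()}
    for key, (n, s, m) in b.items():
        if key in out:
            out[key][0] += n
            out[key][1] += s
            out[key][2] = max(out[key][2], m)
        else:
            out[key] = [n, s, m]
    return out


def finalize(cells):
    """Convert accumulators to (count, average speed, maximum speed)."""
    return {key: (n, s / n, m) for key, (n, s, m) in cells.items()}
```

Because count, sum, and maximum can all be merged across partial results, the point set can be split into chunks, each chunk rasterized independently (for example, one per processor core), and the partial grids combined with `merge` before calling `finalize`. The cost of a single pass grows linearly with the number of points, which is consistent with the proportional behaviour described in the abstract.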
