Comparison of home detection algorithms using smartphone GPS data
Rajat Verma, Shagun Mittal, Zengxiang Lei, Xiaowei Chen, Satish V. Ukkusuri
TL;DR
Estimating home locations from smartphone GPS data is essential for large-scale human mobility analyses, but home detection algorithms (HDAs) have lacked systematic evaluation. The authors compare five HDAs, including a newly proposed one ($A_4$), across eight GPS datasets from four U.S. cities using three proxy metrics ($M_1$, $M_2$, and $M_3$), and also analyze downstream impacts. They show that temporal and spatial continuity of data points matters more than data size for accurate home detection, and that HDA choice can materially alter inferences about hurricane evacuation and socioeconomic status (SES). The study provides metric-driven guidance for selecting HDAs to improve transparency and reliability in mobility research and policy assessment.
Abstract
Estimation of people's home locations using location-based services data from smartphones is a common task in human mobility assessment. However, commonly used home detection algorithms (HDAs) are often arbitrary and unexamined. In this study, we review existing HDAs and examine five of them using eight high-quality mobile phone geolocation datasets from four U.S. cities. These include four commonly used HDAs as well as an HDA proposed in this work. To make quantitative comparisons, we propose three novel metrics to assess the quality of detected home locations and test them on all eight datasets. We find that all three metrics rank the HDAs consistently, with the proposed HDA outperforming the others. We infer that the temporal and spatial continuity of the geolocation data points matters more than the overall size of the data for accurate home detection. We also find that HDAs with high (and similar) performance metrics tend to produce results that are more consistent and closer to common expectations. Further, performance deteriorates as the data quality of the devices decreases, though the patterns of relative performance persist. Finally, we show how differences in home detection can lead to substantial differences in subsequent inferences using two case studies: (i) hurricane evacuation estimation, and (ii) correlation of mobility patterns with socioeconomic status. Our work contributes to improving the transparency of large-scale human mobility assessment applications.
