HoWDe: a validated algorithm for Home and Work location Detection
Sílvia De Sojo, Lorenzo Lucchini, Ollin D. Langle-Chimal, Samuel P. Fraiberger, Laura Alessandretti
TL;DR
HoWDe addresses the lack of robust, reproducible home/work detection from smartphone GPS data by introducing an open-source, modular pipeline that explicitly handles missing data and varying sampling rates. The method is validated against two ground-truth datasets (D1 and D2), reaching accuracies of up to 97.3% for home and 88.1% for work detection in D1, and substantial but lower accuracy in D2 because of pandemic-related mobility changes. The approach uses a sliding-window, fraction-based scheme with a small set of interpretable parameters, which supports robust detection across demographics and geographies and allows downstream analyses of employment rates and commuting patterns. By providing validated code and privacy-preserving data practices, HoWDe promotes standardization, comparability, and responsible data sharing in human mobility research.
Abstract
Smartphone location data have become a key resource for understanding urban mobility, yet extracting actionable insights requires robust and reproducible preprocessing pipelines. A central step is the identification of individuals' home and work locations, which underpins analyses of commuting, employment, accessibility, and socioeconomic patterns. However, existing approaches are often ad hoc, data-specific, and difficult to reproduce, limiting comparability across studies and datasets. We introduce HoWDe, an open-source software library for detecting home and work locations from large-scale mobility data. HoWDe implements a transparent, modular pipeline explicitly designed to handle missing data, heterogeneous sampling rates, and differences in data sparsity across individuals. The code exposes a small set of interpretable parameters that users can tune to adapt the algorithm to diverse applications and datasets. Using two unique ground-truth datasets comprising 5,099 individuals across 68 countries, we show that HoWDe achieves home and work detection accuracies of up to 97% and 88%, respectively, with consistent performance across demographic groups and geographic contexts. We further demonstrate how parameter settings propagate to downstream metrics such as employment estimates and commuting flows, highlighting the importance of transparent methodological choices. By providing a validated, documented, and easily deployable pipeline, HoWDe supports scalable in-house preprocessing and facilitates the sharing of privacy-preserving mobility datasets. Our software and evaluation benchmarks establish methodological standards that enhance the robustness and reproducibility of human mobility research at urban and national scales.
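To make the idea of a sliding-window, fraction-based detection with explicit handling of missing data more concrete, the following is a minimal sketch in Python. It is not HoWDe's actual implementation or API: the function name `detect_home_sliding`, the parameters `window`, `min_observed_frac`, and `min_location_frac`, and the input format (one night-time location label per day, `None` for days without data) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def detect_home_sliding(
    daily_locations: Sequence[Optional[Hashable]],
    window: int = 7,
    min_observed_frac: float = 0.5,
    min_location_frac: float = 0.6,
) -> List[Optional[Hashable]]:
    """Assign a home label to each day from a trailing window of days.

    Hypothetical illustration only; names and thresholds are assumptions.
    daily_locations[i] is the location seen at night on day i, or None
    when no data was recorded for that day.
    """
    labels: List[Optional[Hashable]] = []
    for day in range(len(daily_locations)):
        # Trailing window ending on the current day (shorter at the start).
        start = max(0, day - window + 1)
        observed = [
            loc for loc in daily_locations[start:day + 1] if loc is not None
        ]

        # Missing-data check: require enough observed days relative to
        # the nominal window length, otherwise leave the day unlabeled.
        if len(observed) / window < min_observed_frac:
            labels.append(None)
            continue

        # Fraction-based rule: the most frequent location must account
        # for a minimum share of the observed days in the window.
        location, count = Counter(observed).most_common(1)[0]
        if count / len(observed) >= min_location_frac:
            labels.append(location)
        else:
            labels.append(None)
    return labels


# Example: a user who moves from location "A" to location "B".
days = ["A", "A", None, "A", "B", "A", None, "B", "B", "B", "B"]
home = detect_home_sliding(
    days, window=4, min_observed_frac=0.5, min_location_frac=0.6
)
```

In this sketch a day stays unlabeled (`None`) either when too few days in the window carry data or when no single location dominates. This is one simple way to expose the trade-off between coverage and confidence through a few interpretable thresholds.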
