Static vs. Dynamic Databases for Indoor Localization based on Wi-Fi Fingerprinting: A Discussion from a Data Perspective
Zhe Tang, Ruocheng Gu, Sihao Li, Kyeong Soo Kim, Jeremy S. Smith
TL;DR
This work addresses the challenge that static Wi-Fi fingerprint databases do not capture temporal variability in indoor RSSI, which degrades localization robustness. It proposes a data-centric perspective and demonstrates a dynamic RSSI database collected over 44 days in the XJTLU IR building to quantify temporal shifts using variance analysis and anomaly detection. The study shows that localization accuracy can deteriorate over time, with Gaussian Process regression errors reaching up to 6.65 m after 14 days without retraining, underscoring the limitations of static databases. The authors advocate dynamic databases as a practical path forward and provide guidelines for cost-effective construction using local coordinates and hybrid anchor-device measurements to support real-world deployment.
Abstract
Wi-Fi fingerprinting has emerged as the most popular approach to indoor localization. The use of ML algorithms has greatly improved the localization performance of Wi-Fi fingerprinting, but its success depends on the availability of fingerprint databases composed of a large number of RSSIs, the MAC addresses of access points, and the other measurement information. However, most fingerprint databases do not reflect well the time varying nature of electromagnetic interferences in complicated modern indoor environment. This could result in significant changes in statistical characteristics of training/validation and testing datasets, which are often constructed at different times, and even the characteristics of the testing datasets could be different from those of the data submitted by users during the operation of localization systems after their deployment. In this paper, we consider the implications of time-varying Wi-Fi fingerprints on indoor localization from a data-centric point of view and discuss the differences between static and dynamic databases. As a case study, we have constructed a dynamic database covering three floors of the IR building of XJTLU based on RSSI measurements, over 44 days, and investigated the differences between static and dynamic databases in terms of statistical characteristics and localization performance. The analyses based on variance calculations and Isolation Forest show the temporal shifts in RSSIs, which result in a noticeable trend of the increase in the localization error of a Gaussian process regression model with the maximum error of 6.65 m after 14 days of training without model adjustments. The results of the case study with the XJTLU dynamic database clearly demonstrate the limitations of static databases and the importance of the creation and adoption of dynamic databases for future indoor localization research and real-world deployment.
