Privacy-Preserving Visual Localization with Event Cameras
Junho Kim, Young Min Kim, Ramzi Zahreddine, Weston A. Welge, Gurunandan Krishnan, Sizhuo Ma, Jian Wang
TL;DR
This work addresses privacy-aware visual localization on resource-constrained edge devices by leveraging event cameras and a two-tier privacy framework. It combines event-to-image conversion with mature image-based localization to achieve robust 6-DoF pose estimation under challenging conditions, while protecting privacy through network-level split inference and sensor-level filtering. The authors introduce the EvRooms and EvHumans datasets, conduct a user study, and demonstrate that the privacy protections yield meaningful security gains with only modest localization degradation. Overall, the approach offers a practical, privacy-preserving building block for location-based services in mobile AR/VR contexts.
Abstract
We consider the problem of client-server localization, where edge device users send visual data to a service provider in order to locate themselves against a pre-built 3D map. This localization paradigm is a crucial component of location-based services in AR/VR and mobile applications, since it is not trivial to store large-scale 3D maps and perform fast localization on resource-limited edge devices. Nevertheless, conventional client-server localization systems face numerous challenges in computational efficiency, robustness, and privacy preservation during data transmission. Our work aims to jointly solve these challenges with a localization pipeline based on event cameras. By using event cameras, our system consumes little energy and requires only a small memory bandwidth. During localization, we propose applying event-to-image conversion and leveraging mature image-based localization, which achieves robustness even in low-light or fast-moving scenes. To further enhance privacy protection, we introduce privacy protection techniques at two levels. Network-level protection aims to hide the user's entire view in private scenes using a novel split inference approach, while sensor-level protection aims to hide sensitive user details such as faces with lightweight filtering. Both methods incur only a small client-side computation cost and a small loss in localization performance, while significantly mitigating the feeling of insecurity, as revealed in our user study. We thus expect our method to serve as a building block for practical location-based services using event cameras. The project page, including the code, is available at https://82magnolia.github.io/event_localization/.
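
As a rough illustration of the general idea behind split inference, the sketch below partitions a toy network between a client and a server, so that only intermediate features (not the raw sensor data) leave the device. This is a generic sketch and not the authors' method: the abstract does not say where or how the network is split, and the names `client_forward`, `server_forward`, `W_head`, and `W_tail`, the layer sizes, and the random weights are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-stage network with random placeholder weights
# (not a trained model, and not the architecture used in the paper).
W_head = rng.standard_normal((16, 8))  # layers kept on the client
W_tail = rng.standard_normal((8, 4))   # layers run by the server


def relu(x):
    return np.maximum(x, 0.0)


def client_forward(raw_input):
    """Runs on the edge device; only the returned features are transmitted."""
    return relu(raw_input @ W_head)


def server_forward(features):
    """Runs on the server, which never receives raw_input."""
    return features @ W_tail


def full_forward(raw_input):
    """Reference: the same network evaluated in one place, without a split."""
    return relu(raw_input @ W_head) @ W_tail


raw = rng.standard_normal((1, 16))   # stand-in for sensor data
features = client_forward(raw)       # payload sent over the network
output = server_forward(features)    # result computed on the server
```

Splitting the computation this way leaves the result unchanged (`output` matches `full_forward(raw)` up to floating-point error); what changes is which party sees which data. How much the transmitted features actually reveal depends on the network and the split point, which this sketch does not address.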
