CorVS: Person Identification via Video Trajectory-Sensor Correspondence in a Real-World Warehouse
Kazuma Kano, Yuki Mori, Shin Katayama, Kenta Urano, Takuro Yonezawa, Nobuo Kawaguchi
TL;DR
The paper tackles identity-aware localization in warehouses, where appearance-based identification is unreliable. It proposes CorVS, a two-stage method that first predicts correspondence probabilities and activity-based reliabilities between visual trajectories and wearable sensor measurements, then incrementally matches trajectories to sensor data. The approach combines a DualCNN-Transformer-based correspondence estimator with a flexible matching algorithm designed for real-world conditions, including multiple participants and workers who wear sensors only intermittently. A real-world warehouse dataset with 29 workers and 19 ceiling cameras is used to train and evaluate the method. The results show substantial improvements over a baseline and reveal a trade-off between longer temporal windows and recall in practical settings. The work advances industry-scale localization by integrating fixed-camera data with inertial sensing, enabling robust, privacy-preserving identity association in dynamic warehouse environments.
Abstract
Worker location data is key to higher productivity at industrial sites. Cameras are a promising tool for localization in logistics warehouses because they also capture valuable environmental context such as package status. However, identifying individuals from visual data alone is often impractical. Accordingly, several prior studies identified people in videos by comparing their trajectories with wearable sensor measurements. While this approach has advantages such as independence from appearance, the existing methods may break down under real-world conditions. To overcome this challenge, we propose CorVS, a novel data-driven person identification method based on correspondence between visual tracking trajectories and sensor measurements. First, our deep learning model predicts a correspondence probability and a reliability for every pairing of a trajectory with a set of sensor measurements. Second, our algorithm matches the trajectories and sensor measurements over time using the predicted probabilities and reliabilities. We developed a dataset from actual warehouse operations and demonstrated the method's effectiveness for real-world applications.
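The two stages described above can be illustrated with a minimal sketch. This is not the CorVS algorithm itself: the abstract does not specify how probabilities and reliabilities are combined, so the reliability-weighted log-odds accumulation, the greedy one-to-one matching, and every name here (`update_scores`, `greedy_match`, `min_score`, the trajectory and sensor IDs, and the numeric values) are assumptions chosen purely for illustration. The model output is replaced by hand-written tuples.

```python
import math
from collections import defaultdict


def update_scores(scores, window_preds):
    """Accumulate evidence for each (trajectory, sensor) pair over one time window.

    window_preds holds (traj_id, sensor_id, prob, reliability) tuples, a
    hypothetical stand-in for the output of a correspondence model.
    Assumption: evidence is the log-odds of prob, scaled by reliability.
    """
    eps = 1e-6
    for traj_id, sensor_id, prob, reliability in window_preds:
        p = min(max(prob, eps), 1.0 - eps)  # keep the log-odds finite
        scores[(traj_id, sensor_id)] += reliability * math.log(p / (1.0 - p))
    return scores


def greedy_match(scores, min_score=0.0):
    """Assign pairs one-to-one, highest accumulated score first.

    Pairs at or below min_score are left unmatched, so a trajectory of
    someone who is not wearing a sensor can remain unidentified.
    """
    used_traj, used_sensor, matches = set(), set(), {}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (traj_id, sensor_id), score in ranked:
        if score <= min_score:
            break  # every remaining pair scores lower still
        if traj_id in used_traj or sensor_id in used_sensor:
            continue
        matches[traj_id] = sensor_id
        used_traj.add(traj_id)
        used_sensor.add(sensor_id)
    return matches


scores = defaultdict(float)
# Window 1: three visual trajectories, two sensors (made-up values).
update_scores(scores, [
    ("t1", "sA", 0.9, 1.0), ("t1", "sB", 0.2, 1.0),
    ("t2", "sA", 0.6, 0.1), ("t2", "sB", 0.7, 0.8),
    ("t3", "sA", 0.3, 0.5), ("t3", "sB", 0.4, 0.5),
])
# Window 2: new predictions refine the accumulated scores.
update_scores(scores, [
    ("t1", "sA", 0.8, 0.9), ("t2", "sB", 0.9, 1.0), ("t3", "sB", 0.45, 0.2),
])
matches = greedy_match(scores)
print(matches)
```

In this toy run, `t1` and `t2` are matched to `sA` and `sB`, while `t3` stays unmatched because none of its accumulated scores is positive. A low reliability (for example, 0.1 for the `t2`/`sA` pair) shrinks that window's contribution, which is one plausible way to discount windows where activity is uninformative. An optimal assignment solver could replace the greedy step; the greedy version is used here only to keep the sketch dependency-free.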
