Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras
Chulhong Min, Juheon Yi, Utku Gunay Acer, Fahim Kawsar
TL;DR
Argus tackles real-time multi-camera, multi-target tracking in overlapping camera deployments by moving computation onto distributed smart cameras and using object-wise spatio-temporal association to avoid redundant identifications. The approach combines a dynamic inspector, which orders camera and bounding-box inspections, with a workload distributor, which runs identification tasks in parallel across cameras, all without cloud support. Evaluations on CityFlowV2, CAMPUS, and MMPTRACK show up to 7.13x fewer identifications and up to 2.19x lower latency while preserving MOTP/MOTA, supporting the practicality of cloud-free, on-device cross-camera collaboration. The work highlights the potential for scalable, privacy-preserving video analytics; future directions include non-overlapping camera networks, alternative coordination topologies, and model splitting to further improve efficiency.
Abstract
Overlapping cameras offer exciting opportunities to view a scene from different angles, allowing for more advanced, comprehensive, and robust analysis. However, existing visual analytics systems for multi-camera streams are mostly limited to (i) per-camera processing and aggregation and (ii) workload-agnostic centralized processing architectures. In this paper, we present Argus, a distributed video analytics system with cross-camera collaboration on smart cameras. We identify multi-camera, multi-target tracking as the primary task of multi-camera video analytics and develop a novel technique that avoids redundant, processing-heavy identification tasks by leveraging object-wise spatio-temporal association in the overlapping fields of view across multiple cameras. We further develop a set of techniques to perform these operations across distributed cameras without cloud support at low latency by (i) dynamically ordering the camera and object inspection sequence and (ii) flexibly distributing the workload across smart cameras, taking into account network transmission and heterogeneous computational capacities. Evaluation on three real-world overlapping camera datasets with two Nvidia Jetson devices shows that Argus reduces the number of object identifications and the end-to-end latency by up to 7.13x and 2.19x, respectively (4.86x and 1.60x compared to the state of the art), while achieving comparable tracking quality.
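To make the idea of object-wise spatio-temporal association concrete, the following is a minimal Python sketch, not the Argus implementation. It assumes that an object already identified in one camera comes with an expected bounding box in a second, overlapping camera (for example, derived from past co-occurrences). Detections in the second camera that overlap an expected box inherit the identity, and only the remaining detections need the heavy identification model. The names `Box`, `iou`, `reuse_identities`, and the IoU threshold of 0.5 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def reuse_identities(expected, detections, threshold=0.5):
    """Assign known identities to detections without running identification.

    expected:   dict mapping object id -> Box where that object is expected
                to appear in this camera (assumed to come from an association
                learned on another, overlapping camera).
    detections: list of Box detected in this camera's current frame.

    Returns (labels, pending): labels maps detection index -> object id,
    pending lists detection indices that still need the identification model.
    """
    labels, pending, used = {}, [], set()
    for idx, det in enumerate(detections):
        best_id, best_score = None, threshold
        for obj_id, box in expected.items():
            if obj_id in used:
                continue  # each known identity is reused at most once
            score = iou(box, det)
            if score >= best_score:
                best_id, best_score = obj_id, score
        if best_id is None:
            pending.append(idx)
        else:
            labels[idx] = best_id
            used.add(best_id)
    return labels, pending


# Usage: one detection matches the expected location, one does not.
expected = {"car_7": Box(100, 100, 200, 200)}
detections = [Box(105, 102, 198, 205), Box(400, 400, 450, 450)]
labels, pending = reuse_identities(expected, detections)
```

In this example the first detection overlaps the expected location of `car_7` and reuses its identity, while the second is left in `pending` for identification. The greedy matching here is a simplification; a real system would also need to handle timing differences between cameras and ambiguous overlaps.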
