Real-time Multi-modal Object Detection and Tracking on Edge for Regulatory Compliance Monitoring
Jia Syuen Lim, Ziwei Wang, Jiajun Liu, Abdelwahed Khamis, Reza Arablouei, Robert Barlow, Ryan McAllister
TL;DR
The paper tackles regulatory compliance monitoring in agrifood facilities by deploying a real-time, edge-based, multi-modal sensing system using a combined 3D ToF depth and RGB camera. It presents a ROS 2–driven pipeline that fuses prescan-based background modeling, KDE/HDBSCAN clustering, and SORT tracking to detect and continuously track knives in a sanitation bath, with a hand-detection module to prevent occlusion errors. The approach achieves near real-time performance and robust object separation under low-light and occlusion, assigning unique IDs and sanitation duration to indicate compliance. The work demonstrates practical impact for automated auditability and suggests extensions to other QA tasks and few-shot learning via pseudo-labeling.
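As an illustration of the last step above (turning tracked IDs into a sanitation duration and a compliance flag), here is a minimal sketch in plain Python. It is not the authors' implementation: the class `DwellMonitor`, its method names, and the `required_seconds` threshold are hypothetical, and the track IDs are assumed to come from an upstream tracker such as SORT.

```python
from dataclasses import dataclass


@dataclass
class DwellRecord:
    """First and last time (in seconds) a track ID was observed in the bath."""
    first_seen: float
    last_seen: float

    @property
    def duration(self) -> float:
        return self.last_seen - self.first_seen


class DwellMonitor:
    """Accumulates per-ID dwell time from tracker output (hypothetical helper)."""

    def __init__(self, required_seconds: float):
        # Assumed threshold: the source does not state the required duration.
        self.required_seconds = required_seconds
        self.records = {}

    def update(self, timestamp: float, track_ids) -> None:
        """Record that each ID in track_ids was visible at the given timestamp."""
        for track_id in track_ids:
            record = self.records.get(track_id)
            if record is None:
                self.records[track_id] = DwellRecord(timestamp, timestamp)
            else:
                record.last_seen = timestamp

    def duration(self, track_id: int) -> float:
        """Seconds between the first and last sighting of a track ID."""
        return self.records[track_id].duration

    def is_compliant(self, track_id: int) -> bool:
        """True once the object has stayed at least the required time."""
        return self.duration(track_id) >= self.required_seconds


# Usage: knife 1 is seen from t=0 s to t=6 s, knife 2 only at t=6 s.
monitor = DwellMonitor(required_seconds=5.0)
monitor.update(0.0, [1])
monitor.update(2.0, [1])
monitor.update(6.0, [1, 2])
```

Keeping only the first and last sighting per ID makes the bookkeeping cheap enough for an edge device, at the cost of relying on the tracker to keep IDs stable through brief occlusions.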
Abstract
Regulatory compliance auditing across diverse industrial domains requires heightened quality assurance and traceability. Current auditing approaches are manual and intermittent, which makes them prone to oversights in the monitoring process. To address these issues, we introduce a real-time, multi-modal sensing system that combines 3D time-of-flight and RGB cameras with unsupervised learning techniques on edge AI devices. The system enables continuous object tracking, which makes record-keeping more efficient and reduces manual intervention. We validate the system in a knife sanitization context within agrifood facilities, emphasizing its robustness to the occlusion and low-light issues that affect RGB cameras, but its potential extends to a variety of industrial monitoring settings.
