Real-time Multi-modal Object Detection and Tracking on Edge for Regulatory Compliance Monitoring

Jia Syuen Lim, Ziwei Wang, Jiajun Liu, Abdelwahed Khamis, Reza Arablouei, Robert Barlow, Ryan McAllister

TL;DR

The paper addresses regulatory compliance monitoring in agrifood facilities with a real-time, edge-based, multi-modal sensing system built around a combined 3D time-of-flight (ToF) depth and RGB camera. It presents a ROS 2-based pipeline that combines prescan-based background modeling, KDE/HDBSCAN clustering, and SORT tracking to detect and continuously track knives in a sanitation bath, with a hand-detection module to prevent occlusion errors. The approach runs in near real time and separates objects robustly under low light and occlusion, assigning each knife a unique ID and a sanitation duration that indicates compliance. The work demonstrates practical value for automated auditing and suggests extensions to other QA tasks and to few-shot learning via pseudo-labeling.
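The last stage described above, turning per-frame track IDs into a sanitation duration and a compliance flag, can be illustrated with a minimal sketch. It assumes the tracker (SORT in the paper) reports the set of visible IDs for each timestamped frame. The class name `DwellTimer`, its methods, and the `REQUIRED_SECONDS` threshold are hypothetical and are not taken from the paper, which does not state the required duration in this summary.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the source does not state the required duration.
REQUIRED_SECONDS = 10.0


@dataclass
class DwellTimer:
    """Tracks how long each object ID has been observed in the bath."""

    required_seconds: float = REQUIRED_SECONDS
    first_seen: dict = field(default_factory=dict)
    last_seen: dict = field(default_factory=dict)

    def update(self, track_ids, timestamp):
        """Record the IDs the tracker reported for one frame."""
        for tid in track_ids:
            # Keep the earliest timestamp, overwrite the latest one.
            self.first_seen.setdefault(tid, timestamp)
            self.last_seen[tid] = timestamp

    def duration(self, tid):
        """Seconds between the first and the most recent sighting of an ID."""
        return self.last_seen[tid] - self.first_seen[tid]

    def is_compliant(self, tid):
        """True once the observed duration reaches the required minimum."""
        return self.duration(tid) >= self.required_seconds

    def report(self):
        """Map each ID to (duration_seconds, compliant)."""
        return {
            tid: (self.duration(tid), self.is_compliant(tid))
            for tid in self.first_seen
        }


timer = DwellTimer(required_seconds=2.0)
timer.update([1], 0.0)      # knife 1 enters the bath
timer.update([1, 2], 1.0)   # knife 2 enters
timer.update([1, 2], 2.0)
timer.update([2], 2.5)      # knife 1 has been removed
print(timer.report())
```

In this sketch, ID 1 is seen from 0.0 s to 2.0 s and meets the 2-second threshold, while ID 2 is seen from 1.0 s to 2.5 s and does not. A deployed system would also need to decide how to treat IDs that disappear briefly, for example while a hand occludes the bath; that is outside the scope of this sketch.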

Abstract

Regulatory compliance auditing across diverse industrial domains requires heightened quality assurance and traceability. Current manual and intermittent approaches to such auditing pose significant challenges and can lead to oversights in the monitoring process. To address these issues, we introduce a real-time, multi-modal sensing system employing 3D time-of-flight and RGB cameras, coupled with unsupervised learning techniques on edge AI devices. This enables continuous object tracking, thereby enhancing efficiency in record-keeping and minimizing manual interventions. While we validate the system in a knife sanitization context within agrifood facilities, emphasizing its robustness to the occlusion and low-light issues that affect RGB cameras, its potential spans various industrial monitoring settings.

Paper Structure (21 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: The overall architecture of our proposed system.
  • Figure 2: Point clouds of the objects of interest isolated from the background (left), density maps produced by KDE (center), and clusters identified by HDBSCAN (right).
  • Figure 3: Illustration showing the original image (left), the 2D projected heatmap (center), and the detection results (right).
  • Figure 4: Left: A web-based visualization system implemented using a Flask API. Right: A real-time 3D visualization of point clouds overlaid with density maps using RViz from ROS 2.
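
The first two panels that Figures 2 and 3 describe (foreground points isolated from the background, then a density map over a 2D projection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the prescan is an empty-scene point cloud, treats any live point that falls in a voxel occupied during the prescan as background, and evaluates a plain Gaussian KDE on a regular grid. The function names, voxel size, cell size, and bandwidth are all hypothetical, and the HDBSCAN clustering step is omitted.

```python
import math


def voxel_key(point, size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / size) for c in point)


def remove_background(points, prescan, voxel_size=0.02):
    """Keep points whose voxel was empty in the prescan (assumed foreground)."""
    occupied = {voxel_key(p, voxel_size) for p in prescan}
    return [p for p in points if voxel_key(p, voxel_size) not in occupied]


def density_map(points, x_range, y_range, cell=0.01, bandwidth=0.02):
    """Gaussian KDE of the (x, y) projection, evaluated at grid cell centers."""
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    # Normalization of a 2D isotropic Gaussian kernel averaged over n points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * max(len(points), 1))
    grid = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        gy = y_range[0] + (j + 0.5) * cell
        for i in range(nx):
            gx = x_range[0] + (i + 0.5) * cell
            total = sum(
                math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2.0 * bandwidth ** 2))
                for x, y, _z in points
            )
            grid[j][i] = norm * total
    return grid


# Synthetic example: a flat 10 cm x 10 cm floor plus two raised "knife" points.
prescan = [(0.01 * i + 0.005, 0.01 * j + 0.005, 0.0)
           for i in range(10) for j in range(10)]
knife = [(0.045, 0.055, 0.05), (0.055, 0.055, 0.05)]
foreground = remove_background(prescan + knife, prescan)
heatmap = density_map(foreground, (0.0, 0.1), (0.0, 0.1))
```

The pure-Python loops keep the sketch dependency-free but scale poorly; a real point cloud would call for an array library and a spatial index. Sensor noise also means a practical background model would need some tolerance, for example by checking neighboring voxels as well.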