TOD: Transprecise Object Detection to Maximise Real-Time Accuracy on the Edge
JunKyu Lee, Blesson Varghese, Roger Woods, Hans Vandierendonck
TL;DR
Edge devices face a fundamental trade-off between inference accuracy and real-time frame rates. TOD tackles this with a transprecise, on-the-fly DNN selection mechanism that uses lightweight runtime features (such as the median bounding-box size and motion cues) to switch among four TensorRT-optimised YOLO variants with negligible overhead. The work introduces a low-overhead, hyperparameter-driven scheduler based on the median bounding-box size (MBBS) and evaluates it on a Jetson Nano with the MOT17Det dataset. It reports substantial accuracy gains over the lightest model and significant reductions in GPU utilisation and power while matching or approaching the accuracy of the best single model. The approach enables more effective real-time object detection at the edge and lays groundwork for broader scheduling of distributed streaming workloads in fog and edge environments.
Abstract
Real-time video analytics on the edge is challenging because computationally constrained devices typically cannot analyse video streams at full fidelity and frame rate, which results in a loss of accuracy. This paper proposes a Transprecise Object Detector (TOD) which maximises real-time object detection accuracy on an edge device by selecting an appropriate Deep Neural Network (DNN) on the fly with negligible computational overhead. TOD makes two key contributions over the state of the art: (1) it leverages characteristics of the video stream, such as object size and speed of movement, to identify networks with high prediction accuracy for the current frames; (2) it selects the best-performing network based on projected accuracy and computational demand using an effective and low-overhead decision mechanism. Experimental evaluation on a Jetson Nano demonstrates that TOD improves the average object detection precision by 34.7 % over the YOLOv4-tiny-288 model, averaged over the MOT17Det dataset. On the MOT17-05 test sequence, TOD uses only 45.1 % of the GPU resource and 62.7 % of the GPU board power of the YOLOv4-416 model without losing accuracy. We expect that TOD will broaden the use of edge devices for real-time object detection, since it maximises the accuracy achievable on a given edge device by adapting to dynamic input features without increasing inference latency in practice.
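The selection idea in the abstract can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes the decision is driven by the median bounding-box area of the previous frame's detections, expressed as a fraction of the frame area, and that smaller objects call for a heavier network. The threshold values, the function names, and the two intermediate model names (`yolov4-tiny-416`, `yolov4-288`) are hypothetical; only YOLOv4-tiny-288 and YOLOv4-416 are named in the abstract.

```python
from statistics import median

# Candidate networks ordered from lightest to heaviest.
# Assumption: only the first and last are named in the abstract;
# the two in the middle are illustrative placeholders.
MODELS = ["yolov4-tiny-288", "yolov4-tiny-416", "yolov4-288", "yolov4-416"]

# Hypothetical ascending thresholds on the median box area,
# expressed as a fraction of the frame area. Not values from the paper.
THRESHOLDS = (0.007, 0.03, 0.1)


def median_box_fraction(boxes, frame_w, frame_h):
    """Return the median box area as a fraction of the frame area.

    boxes is a list of (x, y, w, h) tuples in pixels.
    Returns None when there are no detections.
    """
    if not boxes:
        return None
    frame_area = frame_w * frame_h
    return median((w * h) / frame_area for (_, _, w, h) in boxes)


def select_model(boxes, frame_w, frame_h, current, thresholds=THRESHOLDS):
    """Pick a network for the next frame from the last frame's detections.

    Assumption: smaller objects are harder to detect, so a smaller
    median box size maps to a heavier network.
    """
    mbbs = median_box_fraction(boxes, frame_w, frame_h)
    if mbbs is None:
        # No detections to reason about: keep the current network.
        return current
    h1, h2, h3 = thresholds
    if mbbs < h1:
        return MODELS[3]
    if mbbs < h2:
        return MODELS[2]
    if mbbs < h3:
        return MODELS[1]
    return MODELS[0]
```

The per-frame cost of such a rule is one median over the existing detections and three comparisons, which is consistent with the claim of negligible overhead. A fuller scheduler would also account for the speed of movement and the latency of each network.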
