AyE-Edge: Automated Deployment Space Search Empowering Accuracy yet Efficient Real-Time Object Detection on the Edge

Chao Wu, Yifan Gong, Liangkai Liu, Mengquan Li, Yushu Wu, Xuan Shen, Zhimin Li, Geng Yuan, Weisong Shi, Yanzhi Wang

TL;DR

AyE-Edge tackles the fundamental trade-off among accuracy, latency, and power in edge object detection (Edge-OD) by introducing an automated deployment-space search across the data, model, and hardware layers. It combines three components (an optimized deployment space with T-Locality keyframes, a performance collector with a latency predictor, and a MARL-driven coordinator) to discover Pareto-optimal configurations under real-time constraints. The approach yields substantial power savings (up to a 96.7% reduction against SOTA baselines) while maintaining competitive accuracy and throughput on mobile hardware. This work presents a first-of-its-kind, coordinated pipeline for real-time Edge-OD that jointly optimizes keyframe selection, model pruning, and DVFS across CPU and GPU. It is validated on real devices and public datasets, with practical implications for energy-aware edge AI systems.
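
To make "Pareto-optimal configurations under real-time constraints" concrete, the following minimal Python sketch filters a handful of candidate deployments down to the non-dominated ones that meet a frame deadline. It is an illustration only: AyE-Edge searches the space with a MARL-driven coordinator rather than by exhaustive filtering, and the `Deployment` type, the candidate names, the metric values, and the 33.3 ms deadline are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Deployment:
    """One hypothetical point in the deployment space with its measured metrics."""
    name: str
    accuracy: float    # detection accuracy (e.g. mAP), higher is better
    latency_ms: float  # per-frame latency, lower is better
    power_w: float     # average power draw, lower is better


def dominates(a: Deployment, b: Deployment) -> bool:
    """True if `a` is no worse than `b` on every metric and strictly better on one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.latency_ms <= b.latency_ms
                and a.power_w <= b.power_w)
    strictly_better = (a.accuracy > b.accuracy
                       or a.latency_ms < b.latency_ms
                       or a.power_w < b.power_w)
    return no_worse and strictly_better


def pareto_front(candidates: List[Deployment], deadline_ms: float) -> List[Deployment]:
    """Drop candidates that miss the deadline, then keep the non-dominated ones."""
    feasible = [c for c in candidates if c.latency_ms <= deadline_ms]
    return [c for c in feasible if not any(dominates(o, c) for o in feasible)]


# Hypothetical measurements, invented purely for illustration.
CANDIDATES = [
    Deployment("dense-model-high-freq", accuracy=0.50, latency_ms=30.0, power_w=4.0),
    Deployment("pruned-model-mid-freq", accuracy=0.48, latency_ms=25.0, power_w=2.5),
    Deployment("pruned-model-low-freq", accuracy=0.45, latency_ms=28.0, power_w=3.0),
    Deployment("dense-model-low-freq", accuracy=0.55, latency_ms=60.0, power_w=5.0),
]

if __name__ == "__main__":
    for d in pareto_front(CANDIDATES, deadline_ms=33.3):
        print(d.name, d.accuracy, d.latency_ms, d.power_w)
```

In this toy data, the most accurate candidate misses the deadline and is excluded before any comparison, and one pruned configuration is dominated by another on all three metrics. The two survivors trade accuracy against latency and power, which is the kind of choice a coordinator has to resolve.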

Abstract

Object detection on the edge (Edge-OD) is in growing demand thanks to its ever-broadening application prospects. However, progress in this field is severely restricted by the deployment dilemma of simultaneously achieving high accuracy and excellent power efficiency while meeting strict real-time requirements. To tackle this dilemma, we propose AyE-Edge, a first-of-its-kind development tool that explores automated algorithm-device deployment space search to realize Accurate yet power-Efficient real-time object detection on the Edge. Through a collaborative exploration of keyframe selection, CPU-GPU configuration, and DNN pruning strategy, AyE-Edge excels in extensive real-world experiments conducted on a mobile device. The results consistently demonstrate AyE-Edge's effectiveness: it achieves outstanding real-time performance and detection accuracy and, notably, a remarkable 96.7% reduction in power consumption compared to state-of-the-art (SOTA) competitors.

Paper Structure

This paper contains 8 sections, 3 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: (a) The dilemma of achieving high accuracy and excellent power efficiency while meeting strict real-time requirements; the impact of keyframe selection strategies, DNN model pruning methods, and DVFS techniques on (b) detection accuracy, (c) power consumption, and (d) real-time performance.
  • Figure 2: The architecture for object detection on the edge.
  • Figure 3: The proposed AyE-Edge development tool.
  • Figure 4: (a) SSIM features of all frames from a video clip, based on YOLO-v5 detectors with the BDD100K dataset; (b) the mAP comparison among methods.
  • Figure 5: The latencies predicted by our Latency Predictor in Eq. (\ref{eq:prune}) across (a) various CPU V/F levels with the GPU fixed at its highest frequency and (b) various GPU V/F levels with the CPUs fixed at their highest frequency. Assumes $VF^C_{\max}=1.8\,\mathrm{GHz}$, $VF^G_{\max}=587\,\mathrm{MHz}$, and $\gamma=3.36$.
  • ...and 1 more figures