E4: Energy-Efficient DNN Inference for Edge Video Analytics Via Early-Exit and DVFS

Ziyang Zhang; Yang Zhao; Ming-Ching Chang; Changyao Lin; Jie Liu

TL;DR

E4 tackles the energy efficiency bottleneck of DNN inference for edge video analytics by coupling an attention-guided early-exit mechanism with a Just-In-Time (JIT) profiler that uses coordinate descent to co-optimize per-layer CPU/GPU frequencies ahead of exit points. The approach accounts for varying video frame complexity and DNN model diversity, enabling dynamic, frame-aware exit decisions and fine-grained power management. Across five edge devices and two DNN backbones on ActivityNet-v1.3 and Mini-Kinetics, E4 achieves up to 2.8× speedup and up to 26% average energy savings while maintaining accuracy, with notable memory reductions and stronger gains on higher-end hardware. This work demonstrates that integrating frame-aware early exits with fine-grained DVFS, guided by an efficient JIT profiler, can substantially improve practical edge inference for video analytics.

Abstract

Deep neural network (DNN) models are increasingly popular in edge video analytics applications. However, the compute-intensive nature of DNN models poses challenges for energy-efficient inference on resource-constrained edge devices. Most existing solutions focus on optimizing DNN inference latency and accuracy, often overlooking energy efficiency. They also fail to account for the varying complexity of video frames, leading to sub-optimal performance in edge video analytics. In this paper, we propose an Energy-Efficient Early-Exit (E4) framework that enhances DNN inference efficiency for edge video analytics by integrating a novel early-exit mechanism with dynamic voltage and frequency scaling (DVFS) governors. E4 employs an attention-based cascade module to analyze video frame diversity and automatically determine optimal DNN exit points. Additionally, E4 features a just-in-time (JIT) profiler that uses coordinate descent search to co-optimize CPU and GPU clock frequencies for each layer before the DNN exit points. Extensive evaluations demonstrate that E4 outperforms current state-of-the-art methods, achieving up to 2.8× speedup and 26% average energy saving while maintaining high accuracy.
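To make the coordinate descent idea concrete, the following is a minimal sketch of searching a discrete CPU/GPU frequency grid one axis at a time. It is not the paper's JIT profiler: the function names, the frequency lists, the deadline, and the toy energy model (cubic power in frequency, latency inversely proportional to frequency) are all assumptions chosen purely for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical frequency grids in MHz (not taken from any real device).
CPU_FREQS = [600, 1000, 1400, 1800, 2200]
GPU_FREQS = [300, 600, 900, 1200]
DEADLINE_S = 0.08  # assumed per-layer latency budget in seconds


def toy_energy(cpu_mhz: int, gpu_mhz: int) -> float:
    """Toy energy model (an assumption): energy = power * latency."""
    fc, fg = cpu_mhz / 1000.0, gpu_mhz / 1000.0  # convert to GHz
    latency = 0.02 / fc + 0.03 / fg              # seconds
    if latency > DEADLINE_S:
        return float("inf")                      # infeasible setting
    power = 1.5 + 0.8 * fc**3 + 2.0 * fg**3      # watts
    return power * latency                       # joules


def coordinate_descent(
    cpu_freqs: Sequence[int],
    gpu_freqs: Sequence[int],
    cost: Callable[[int, int], float],
    max_rounds: int = 10,
) -> Tuple[int, int, float]:
    """Alternately minimise cost over one frequency while fixing the other."""
    # Start from the highest setting, which is the most likely to be feasible.
    cpu, gpu = cpu_freqs[-1], gpu_freqs[-1]
    best = cost(cpu, gpu)
    for _ in range(max_rounds):
        improved = False
        # Sweep the CPU axis with the GPU frequency held fixed.
        for f in cpu_freqs:
            c = cost(f, gpu)
            if c < best:
                cpu, best, improved = f, c, True
        # Sweep the GPU axis with the CPU frequency held fixed.
        for f in gpu_freqs:
            c = cost(cpu, f)
            if c < best:
                gpu, best, improved = f, c, True
        if not improved:
            break  # no single-axis move lowers the cost any further
    return cpu, gpu, best


cpu, gpu, energy = coordinate_descent(CPU_FREQS, GPU_FREQS, toy_energy)
print(f"CPU {cpu} MHz, GPU {gpu} MHz, energy {energy:.4f} J")
```

Each round evaluates only `len(cpu_freqs) + len(gpu_freqs)` settings instead of the full grid, which is the usual reason for preferring coordinate descent when every evaluation requires a measurement on the device. The trade-off is that the search can stop at a point that is optimal along each axis but not over the whole grid.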
