AdaOper: Energy-efficient and Responsive Concurrent DNN Inference on Mobile Devices

Zheng Lin, Bin Guo, Sicong Liu, Wentao Zhou, Yasan Ding, Yu Zhang, Zhiwen Yu

TL;DR

AdaOper tackles energy efficiency for long-running DNN inference on mobile devices with heterogeneous processors by challenging the assumption that maximizing parallelism always saves energy. It introduces a runtime energy profiler that fuses offline GBDT-based energy modeling with GRU-based runtime adaptation to provide real-time energy feedback, and an energy-aware operator partitioning module that uses dynamic programming to adapt partitions in response to energy conditions. The system demonstrates up to 16.88% energy savings and up to 12.97% latency reduction on real hardware compared to CoDL under varying workloads, validating the effectiveness of energy-aware cross-processor partitioning in mobile DNN inference. This work enables more battery-efficient and responsive mobile AI applications by continuously aligning computation partitioning with dynamic hardware and workload conditions.

Abstract

Deep neural networks (DNNs) have driven a wide range of mobile applications. For long-running apps on battery-powered smartphones, such as voice assistants and video applications, energy efficiency is critical. The rise of heterogeneous processors in mobile devices has introduced new challenges for optimizing energy efficiency. Our key insight is that partitioning computation across different processors for parallelism and speedup does not necessarily reduce energy consumption and may even increase it. To address this, we present AdaOper, an energy-efficient concurrent DNN inference system that optimizes energy efficiency on mobile heterogeneous processors while maintaining responsiveness. AdaOper includes a runtime energy profiler that dynamically adjusts operator partitioning to optimize energy efficiency under changing device conditions. Preliminary experiments show that AdaOper reduces energy consumption by 16.88% compared to the existing concurrent method while ensuring real-time performance.
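
The energy-aware partitioning idea mentioned in the TL;DR can be illustrated with a small dynamic program. This is only a minimal sketch under stated assumptions, not AdaOper's actual algorithm: the function name `plan_partitions`, the candidate format, and all latency and energy numbers are hypothetical, and a real system would obtain the per-operator costs from an energy profiler rather than a hard-coded table. Each operator has a few candidate placements, each with an estimated latency and energy; the program picks one placement per operator to minimize total energy within a latency budget.

```python
from typing import Dict, List, Optional, Tuple

# One candidate placement of an operator: (latency in ms, energy in mJ).
# Hypothetical convention here: index 0 = CPU only, 1 = GPU only, 2 = CPU+GPU split.
Candidate = Tuple[int, float]


def plan_partitions(
    ops: List[List[Candidate]], budget_ms: int
) -> Optional[Tuple[float, List[int]]]:
    """Choose one candidate per operator, minimizing total energy
    subject to total latency <= budget_ms. Returns (energy, choices)
    or None if no combination fits the budget."""
    # best[t] = (lowest energy found, choices) among plans using exactly t ms.
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for candidates in ops:
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for used_ms, (energy, choices) in best.items():
            for idx, (lat, en) in enumerate(candidates):
                total_ms = used_ms + lat
                if total_ms > budget_ms:
                    continue  # this placement would break the latency budget
                cand_energy = energy + en
                if total_ms not in nxt or cand_energy < nxt[total_ms][0]:
                    nxt[total_ms] = (cand_energy, choices + [idx])
        best = nxt
    if not best:
        return None
    return min(best.values(), key=lambda plan: plan[0])


# Made-up costs: the split (index 2) is always fastest but costs the most energy.
ops = [
    [(10, 5.0), (6, 4.0), (4, 6.0)],  # operator A
    [(8, 3.0), (5, 3.5), (3, 5.0)],   # operator B
]
print(plan_partitions(ops, budget_ms=20))  # loose budget: lowest-energy placements
print(plan_partitions(ops, budget_ms=7))   # tight budget: forced onto the splits
```

With these made-up numbers, a loose budget never selects the parallel split, while a tight budget forces it at a higher energy cost. This mirrors the key insight above that parallelism for speedup does not automatically reduce energy.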

Paper Structure (6 sections, 2 figures)

This paper contains 6 sections, 2 figures.

Figures (2)

  • Figure 1: Workflow of AdaOper.
  • Figure 2: Performance comparison under different workload conditions.