Energy-Aware Dynamic Neural Inference

Marcello Bullo, Seifallah Jardak, Pietro Carnelli, Deniz Gündüz

TL;DR

This work considers an on-device adaptive inference system equipped with an energy-harvester and finite-capacity energy storage, and derives a principled policy with theoretical guarantees for confidence-aware and -agnostic controllers in multi-exit networks.

Abstract

The growing demand for intelligent applications beyond the network edge, coupled with the need for sustainable operation, is driving the seamless integration of deep learning (DL) algorithms into energy-limited, and even energy-harvesting, end devices. However, the stochastic nature of ambient energy sources often results in insufficient harvesting rates, failing to meet the energy requirements for inference and causing significant performance degradation in energy-agnostic systems. To address this problem, we consider an on-device adaptive inference system equipped with an energy-harvester and finite-capacity energy storage. We then allow the device to reduce the run-time execution cost on demand, by either switching between differently-sized neural networks, referred to as multi-model selection (MMS), or by enabling earlier predictions at intermediate layers, called early exiting (EE). The model to be employed, or the exit point, is then dynamically chosen based on the energy storage and harvesting process states. We also study the efficacy of integrating the prediction confidence into the decision-making process. We derive a principled policy with theoretical guarantees for confidence-aware and -agnostic controllers. Moreover, in multi-exit networks, we study the advantages of taking decisions incrementally, exit-by-exit, by designing a lightweight reinforcement learning-based controller. Experimental results show that, as the rate of the ambient energy increases, energy- and confidence-aware control schemes show approximately 5% improvement in accuracy compared to their energy-aware confidence-agnostic counterparts. Incremental approaches achieve even higher accuracy, particularly when the energy storage capacity is limited relative to the energy consumption of the inference model.
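To make the setting in the abstract concrete, the following is a minimal toy sketch of an energy-aware, confidence-agnostic exit selector running on a device with finite energy storage and stochastic harvesting. It is not the policy derived in the paper: the per-exit costs and accuracies, the Bernoulli harvesting model, the storage capacity, and the names `threshold_policy`, `cheapest_policy`, and `simulate` are all assumptions chosen for illustration.

```python
import random

# Hypothetical per-exit energy costs and accuracies (illustrative numbers only).
EXIT_COST = [1, 2, 4]          # energy units consumed when stopping at exit k
EXIT_ACC = [0.60, 0.75, 0.90]  # probability of a correct prediction at exit k
CAPACITY = 6                   # finite energy-storage capacity (assumed)
HARVEST_UNITS = 2              # energy gained per successful harvest (assumed)


def threshold_policy(battery):
    """Pick the deepest exit whose cost fits the current battery, or None."""
    affordable = [k for k, cost in enumerate(EXIT_COST) if cost <= battery]
    return affordable[-1] if affordable else None


def cheapest_policy(battery):
    """Baseline that always uses the first exit whenever it is affordable."""
    return 0 if EXIT_COST[0] <= battery else None


def simulate(policy, harvest_prob, steps=10_000, seed=0):
    """Return the fraction of correctly classified inputs over `steps` slots."""
    rng = random.Random(seed)
    battery, correct = CAPACITY, 0
    for _ in range(steps):
        k = policy(battery)
        if k is not None:
            # Spend the energy of the chosen exit and draw a correct/incorrect outcome.
            battery -= EXIT_COST[k]
            correct += rng.random() < EXIT_ACC[k]
        # Bernoulli harvesting, clipped at the storage capacity.
        if rng.random() < harvest_prob:
            battery = min(CAPACITY, battery + HARVEST_UNITS)
    return correct / steps


if __name__ == "__main__":
    for p in (0.2, 0.6, 1.0):
        print(p, simulate(threshold_policy, p), simulate(cheapest_policy, p))
```

The greedy rule above only looks at the current storage level; the controllers studied in the paper additionally account for the harvesting process state and, in the confidence-aware variants, for the prediction confidence.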

Paper Structure

This paper contains 31 sections, 4 theorems, 48 equations, 7 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

For any input instance ${\textnormal{w}}_{t_n}$, and any specified action $a_{t_n}$, there exists a sequence of $K-1$ sub-actions $\{\alpha_{t_n+\tau}\}_{\tau=0}^{T-1}$ that corresponds to ${\textnormal{a}}_{t_n}\in\mathcal{A}$. Such a correspondence can be expressed in closed form as … Therefore, an optimal policy $\pi_{\text{inc}}^*:\mathcal{S}\times \Lambda\to\mathscr{P}(\Lambda)$ with respe…
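The idea behind the lemma, that a one-shot exit choice can be unrolled into a sequence of $K-1$ per-exit decisions, can be illustrated with a small sketch. The closed-form expression from the paper is not reproduced in this excerpt, so the encoding below is only one plausible choice and is an assumption: exits are indexed $1,\dots,K$, each sub-action is binary (continue or exit), entries after the first `EXIT` are padding, and any "no inference" action the paper may include is ignored. The names `SubAction`, `to_sub_actions`, and `to_exit_index` are hypothetical.

```python
from enum import Enum


class SubAction(Enum):
    CONTINUE = 0  # run the next block of the multi-exit network
    EXIT = 1      # stop and output the prediction of the current exit


def to_sub_actions(exit_index, num_exits):
    """Expand a one-shot choice (1..num_exits) into num_exits - 1 sub-actions.

    Assumed encoding: CONTINUE at every stage before the chosen exit, EXIT from
    the chosen exit onward. Choosing the last exit needs no explicit EXIT.
    """
    if not 1 <= exit_index <= num_exits:
        raise ValueError("exit_index must lie in 1..num_exits")
    return [
        SubAction.CONTINUE if stage < exit_index else SubAction.EXIT
        for stage in range(1, num_exits)
    ]


def to_exit_index(sub_actions):
    """Recover the one-shot exit index from a sub-action sequence."""
    for stage, action in enumerate(sub_actions, start=1):
        if action is SubAction.EXIT:
            return stage
    # No EXIT was issued, so the input ran through to the final exit.
    return len(sub_actions) + 1


if __name__ == "__main__":
    K = 3
    for k in range(1, K + 1):
        print(k, [a.name for a in to_sub_actions(k, K)])
```

Under this encoding the two functions are inverses of each other on valid exit indices, which is the property that lets an incremental controller reproduce any one-shot action.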

Figures (7)

  • Figure 1: System diagram of an adaptive inference system for resource-constrained devices. Sensor data are processed by a computing module (e.g., MMS or EE), where the computation mode (e.g., model or exit selection) is regulated by the controller based on the energy storage (ES) level, energy harvesting (EH) dynamics, and potential feedback information (e.g., prediction confidence). In adaptive DNNs, the predictive module represents the sequence of operations converting the output likelihoods into the final prediction. In this scenario, the goal of the controller is to trade off the prediction accuracy against the processing energy cost.
  • Figure 2: Operational granularity of action selection.
  • Figure 3: Controlled computing modules defined by the granularity of action selection and the availability of feedback information. (a) and (b) depict the control scheme and the temporal dynamics of an oracle os-iaw controller, respectively. In the absence of feedback, this model reduces to the MMS scheme. Conversely, (c) and (d) represent incremental (inc) controls applicable to multi-exit networks: under causal feedback, the inc-iaw-ee scheme is realized, whereas the absence of feedback yields the inc-iag-ee scheme.
  • Figure 4: Partition $\mathcal{P}_{\bm{s}}$ in the $z^{(1)}$-$z^{(2)}$ plane induced by the optimal value function with $K=3$ exit classifiers.
  • Figure 5: FLOPs required to process an input instance for each sub-network $f_i(\cdot;\theta^{(i)})$, with $i$ being the exit anchor, of the corresponding EfficientNet model. The inset shows the FLOPs for all EfficientNet models (from B0 to B6).
  • ...and 2 more figures

Theorems & Definitions (7)

  • Lemma 1
  • Theorem 1: Optimal value function
  • Lemma 2: Optimal policy
  • Example 1: os-iaw with Initial Random Predictor
  • Theorem 2: Optimality of monotone policies for MMS
  • Proof of Theorem 1
  • Proof of Theorem 2