Energy-Aware Multi-Exit TinyML for Smart Zero-Energy Devices

Shahab Jahanbazi, Mateen Ashraf, Lieven De Strycker, Jeroen Famaey, Onel L. A. Lopez

TL;DR

This paper designs, trains, and deploys a tiny machine learning (TinyML) model for person detection on a zero-energy device (ZED). The architecture stores a single model in memory and enables adaptive inference through multiple exit points, so computational effort scales with input difficulty.

Abstract

The proliferation of smart and autonomous systems has motivated a shift toward executing intelligence directly on edge devices. This shift becomes particularly challenging for zero-energy devices (ZEDs), where severe constraints on memory, energy availability, and inference accuracy must be addressed simultaneously. In this paper, we present a unified approach to managing these constraints for smart ZEDs. Specifically, we design, train, and deploy a tiny machine learning (TinyML) model for person detection on a ZED. The proposed architecture stores a single model in memory while enabling adaptive inference through multiple exit points, allowing computational effort to scale with input difficulty. As a result, low-energy inference is performed for easy instances, while higher-precision inference is selectively employed for harder cases. This strategy significantly reduces energy consumption without sacrificing detection accuracy. Furthermore, to enhance device autonomy and prevent power failures, we introduce auxiliary energy-aware circuits that dynamically regulate system operation based on available energy. Compared with a state-of-the-art energy-aware single-exit TinyML approach, the proposed method achieves an energy consumption reduction of approximately $29.6\%$. Overall, the proposed framework is appealing for enabling accurate and energy-efficient intelligence on ZED platforms.
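
The adaptive inference described in the abstract can be illustrated with a minimal sketch. It assumes a confidence-threshold exit criterion, which is a common choice for multi-exit networks but is not confirmed by the text above; the names `multi_exit_predict`, `stages`, `heads`, and `thresholds` are hypothetical, and the toy stages and heads stand in for the real backbone slices and classifiers.

```python
from typing import Callable, List, Sequence, Tuple


def multi_exit_predict(
    x,
    stages: Sequence[Callable],
    heads: Sequence[Callable[..., List[float]]],
    thresholds: Sequence[float],
) -> Tuple[int, float, int]:
    """Run one shared model stage by stage and stop at the first confident exit.

    Assumption (not from the source): an exit is taken when its top class
    probability reaches that exit's threshold. The last exit is always taken.
    Returns (predicted label, confidence, 1-based exit index).
    """
    features = x
    last = len(stages) - 1
    for i, (stage, head) in enumerate(zip(stages, heads)):
        features = stage(features)  # compute only the next slice of the backbone
        probs = head(features)      # class probabilities at this exit
        label = max(range(len(probs)), key=probs.__getitem__)
        if i == last or probs[label] >= thresholds[i]:
            return label, probs[label], i + 1


# Toy stand-ins: the input is a scalar "person score" in [0, 1].
toy_stages = [lambda v: v, lambda v: v]
toy_heads = [
    lambda v: [1.0 - v, v],                           # cheap early exit
    lambda v: [0.0, 1.0] if v >= 0.5 else [1.0, 0.0]  # costlier final exit
]
toy_thresholds = [0.9]  # hypothetical threshold for the early exit only

easy = multi_exit_predict(0.95, toy_stages, toy_heads, toy_thresholds)
hard = multi_exit_predict(0.60, toy_stages, toy_heads, toy_thresholds)
print(easy, hard)
```

In this sketch the easy input leaves at the first exit and never pays for the second stage, while the harder input falls through to the final exit. Only one set of weights is held in memory, since the exits share the same backbone.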

Paper Structure (30 sections, 14 equations, 15 figures, 5 tables)


Figures (15)

  • Figure 1: Block diagram of the Smart-ZED architecture: ambient light is harvested by a solar cell and conditioned by a PMU to supply an MCU and to charge the storage capacitor. The MCU orchestrates image capture, on-device inference, and result indication under strict energy and memory budgets.
  • Figure 2: Execution rules considered in this work. The adaptable start time rule allows execution to be deferred until a predefined deadline (top), whereas the fixed start time rule attempts to start executing the full pipeline at the beginning of each window (bottom).
  • Figure 3: Multi-exit network based on MobileNet V1 (left) and the outputs $O_1$ and $O_2$ obtained by passing sample instances through the trained model (right).
  • Figure 4: Basic policies for runtime selection during the inference phase over window $k$: details of policy (i) (top) and policy (ii) (bottom).
  • Figure 5: Accuracy at EX1 (left), accuracy at EX2 (middle), and total accuracy (right) for various values of $\gamma_1$ and $\gamma_2$.
  • ...and 10 more figures
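
The two execution rules named in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: time is discretized into steps, the storage capacitor gains a constant `harvest_per_step`, and a window is skipped when the pipeline cannot start. The function and parameter names are all hypothetical.

```python
from typing import Optional


def fixed_start(energy_at_start: float, required: float) -> Optional[int]:
    """Fixed start time rule: attempt the full pipeline at the window start only.

    Returns the start step (always 0) or None if the stored energy is
    insufficient. Skipping the window in that case is an assumption.
    """
    return 0 if energy_at_start >= required else None


def adaptable_start(
    energy_at_start: float,
    harvest_per_step: float,
    required: float,
    deadline_step: int,
) -> Optional[int]:
    """Adaptable start time rule: defer execution until enough energy is stored.

    Returns the first step at which the stored energy covers `required`,
    or None if that does not happen by `deadline_step` (inclusive).
    """
    energy = energy_at_start
    for step in range(deadline_step + 1):
        if energy >= required:
            return step
        energy += harvest_per_step  # assumed constant harvesting per step
    return None


# Same window, same starting energy: the fixed rule skips, the adaptable rule waits.
print(fixed_start(5.0, 8.0))
print(adaptable_start(5.0, 1.0, 8.0, deadline_step=5))
```

Under these assumptions, deferring the start lets a device that is briefly short on energy still complete the pipeline within the window, at the cost of a later result.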