Simulating Battery-Powered TinyML Systems Optimised using Reinforcement Learning in Image-Based Anomaly Detection
Jared M. Ping, Ken J. Nixon
TL;DR
This work addresses energy efficiency in battery-powered TinyML image-based anomaly detection by introducing an autonomous reinforcement learning controller that balances on-device retraining against cloud anomaly processing. It adopts a decayed epsilon-greedy Q-learning approach with a compact Q-table suited to resource-constrained MCUs, which allows the controller to be deployed at the edge after cloud pre-training and to keep adapting once deployed. Using simulated hardware and environmental conditions, the study reports battery-life improvements of up to 22.86% over static optimisation and 10.86% over dynamic optimisation, with a memory footprint of 800 B that keeps deployment practical at scale. The findings suggest a viable path to real-world, energy-efficient TinyML deployments in domains such as smart agriculture, with future work focusing on validation on physical hardware.
Abstract
Advances in Tiny Machine Learning (TinyML) have bolstered the creation of smart industry solutions, including smart agriculture, healthcare and smart cities. Whilst related research contributes to enabling TinyML solutions on constrained hardware, real-world adoption also depends on optimising energy consumption in battery-powered systems. The work presented extends TinyML research by optimising battery-powered, image-based anomaly detection Internet of Things (IoT) systems. Whilst previous work in this area has enabled on-device inference and training, there has yet to be an investigation into managing these capabilities with machine learning approaches, such as Reinforcement Learning (RL), to improve the deployed battery life of such systems. Using modelled simulations, the battery-life effects of an RL algorithm are benchmarked against static and dynamic optimisation approaches, laying the foundation for a hardware benchmark to follow. It is shown that using RL within a TinyML-enabled IoT system to optimise system operations, including cloud anomaly processing and on-device training, improves battery life by 22.86% and 10.86% compared to static and dynamic optimisation approaches, respectively. The proposed solution can be deployed to resource-constrained hardware, given its low memory footprint of 800 B, which could be reduced further. This facilitates the real-world deployment of such systems in key sectors such as smart agriculture.
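To make the controller concrete, the following is a minimal sketch of tabular Q-learning with a decayed epsilon-greedy policy and a flat, fixed-size Q-table. It is illustrative only and is not the authors' implementation: the state and action counts, the learning rate, the discount factor, the epsilon schedule and all function names are assumptions. The 50 x 4 table of 32-bit floats is simply one hypothetical layout that totals 800 B; the actual table dimensions are not stated in this section.

```python
import random
from array import array

# All constants below are hypothetical, chosen only for illustration.
N_STATES = 50     # e.g. discretised battery level / anomaly-rate buckets
N_ACTIONS = 4     # e.g. combinations of cloud processing and on-device training
ALPHA = 0.1       # learning rate
GAMMA = 0.9       # discount factor
EPS_MIN = 0.05    # floor for the exploration rate
EPS_DECAY = 0.995 # multiplicative decay applied after each step


def make_q_table():
    """Flat row-major Q-table of 32-bit floats, initialised to zero."""
    return array("f", [0.0] * (N_STATES * N_ACTIONS))


def q_table_bytes(q):
    """Size of the table payload in bytes."""
    return len(q) * q.itemsize


def choose_action(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    row = q[state * N_ACTIONS:(state + 1) * N_ACTIONS]
    return max(range(N_ACTIONS), key=row.__getitem__)


def update(q, state, action, reward, next_state):
    """Standard one-step Q-learning update."""
    base = next_state * N_ACTIONS
    best_next = max(q[base:base + N_ACTIONS])
    idx = state * N_ACTIONS + action
    q[idx] += ALPHA * (reward + GAMMA * best_next - q[idx])


def decay(epsilon):
    """Shrink epsilon towards EPS_MIN so exploration fades over time."""
    return max(EPS_MIN, epsilon * EPS_DECAY)
```

Under these assumptions, a table pre-trained in the cloud could be copied to the device as a plain byte buffer, and the same update rule could continue to run on-device with a small epsilon.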
