Tiny Machine Learning: Progress and Futures
Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Song Han
TL;DR
This paper addresses the challenge of running deep learning on microcontrollers with extremely limited memory, and argues for system-algorithm co-design as the way to make TinyML practical. It introduces MCUNet, a joint framework that combines TinyNAS for automated design of tiny models with TinyEngine for memory-efficient inference, and then extends the approach to on-device training through Quantization-Aware Scaling, sparse updates, and the Tiny Training Engine. Key contributions include automated search-space optimization, Once-For-All NAS specialization, code-generation-based inference, patch-based scheduling, and a complete training stack that fits on-device learning within tiny SRAM budgets. The work reports state-of-the-art ImageNet results on MCUs together with gains in memory efficiency and latency, enabling high-accuracy vision tasks and continual on-device adaptation, with implications for privacy-preserving, low-power AI at the edge.
Abstract
Tiny Machine Learning (TinyML) is a new frontier of machine learning. By squeezing deep learning models into billions of IoT devices and microcontrollers (MCUs), we expand the scope of AI applications and enable ubiquitous intelligence. However, TinyML is challenging due to hardware constraints: the tiny memory budget makes it difficult to hold deep learning models designed for cloud and mobile platforms, and compiler and inference engine support for bare-metal devices is limited. Therefore, we need to co-design the algorithm and system stack to enable TinyML. In this review, we first discuss the definition, challenges, and applications of TinyML. We then survey recent progress in TinyML and deep learning on MCUs. Next, we introduce MCUNet, showing how system-algorithm co-design enables ImageNet-scale AI applications on IoT devices. We further extend the solution from inference to training and introduce tiny on-device training techniques. Finally, we present future directions in this area. Today's large model might be tomorrow's tiny model, so the scope of TinyML should evolve and adapt over time.
