Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu, Marco Galindo, Hongxia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng
TL;DR
This survey addresses the challenge of deploying deep learning in resource-constrained environments through three complementary approaches: lightweight neural network design, model compression, and hardware acceleration. It covers a broad range of lightweight CNN- and transformer-based architectures, analyzes pruning, quantization, knowledge distillation (KD), and neural architecture search (NAS) as compression techniques, and reviews hardware accelerators, dataflow, and software libraries for edge deployment. The authors also discuss challenges and future directions in TinyML and in edge-friendly lightweight LLMs and diffusion models, emphasizing hardware-software co-design. The work offers guidance on selecting architectures and compression strategies for specific hardware and application contexts, connecting design choices from the model level to the system level and outlining practical paths toward real-world edge AI deployment. These insights are relevant to developers and researchers implementing efficient deep learning on mobile, embedded, and IoT platforms while balancing accuracy, latency, and energy efficiency.
Abstract
Over the past decade, deep learning has come to dominate various domains of artificial intelligence, including natural language processing, computer vision, and biomedical signal processing. While model accuracy has improved remarkably, deploying these models on lightweight devices, such as mobile phones and microcontrollers, is constrained by limited resources. In this survey, we provide comprehensive design guidance tailored for these devices, detailing the design of lightweight models, compression methods, and hardware acceleration strategies. The principal goal of this work is to explore methods and concepts for working around hardware constraints without compromising model accuracy. Additionally, we explore two notable future directions for lightweight deep learning: deployment techniques for TinyML and for Large Language Models. Although these directions hold clear potential, they also present significant challenges, encouraging research into unexplored areas.
