Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models

Xubin Wang, Zhiqing Tang, Jianxiong Guo, Tianhui Meng, Chenhao Wang, Tian Wang, Weijia Jia

TL;DR

This survey analyzes the shift of AI deployments to edge and terminal devices, highlighting real-time processing, data privacy, and IoT-driven demand. It provides a structured, end-to-end view of on-device AI—from fundamental concepts and device taxonomy to applications, technical challenges, and optimization strategies (data-, model-, and system-level). The work emphasizes practical methods such as model compression, quantization, pruning, knowledge distillation, and hardware acceleration, while discussing energy efficiency, security, and continuous learning in edge contexts. By integrating emerging technologies like 5G, edge computing, and foundation models, the paper outlines future directions for sustainable, adaptive edge intelligence with broad societal and industrial impact.

Abstract

The rapid advancement of artificial intelligence (AI) technologies has led to an increasing deployment of AI models on edge and terminal devices, driven by the proliferation of the Internet of Things (IoT) and the need for real-time data processing. This survey comprehensively explores the current state, technical challenges, and future trends of on-device AI models. We define on-device AI models as those designed to perform local data processing and inference, emphasizing their characteristics such as real-time performance, resource constraints, and enhanced data privacy. The survey is structured around key themes, including the fundamental concepts of AI models, application scenarios across various domains, and the technical challenges faced in edge environments. We also discuss optimization and implementation strategies, such as data preprocessing, model compression, and hardware acceleration, which are essential for effective deployment. Furthermore, we examine the impact of emerging technologies, including edge computing and foundation models, on the evolution of on-device AI models. By providing a structured overview of the challenges, solutions, and future directions, this survey aims to facilitate further research and application of on-device AI, ultimately contributing to the advancement of intelligent systems in everyday life.

Paper Structure

This paper contains 84 sections, 5 figures, 10 tables.

Figures (5)

  • Figure 1: Structure of this survey.
  • Figure 2: An overview of how on-device AI works. The figure illustrates a general pipeline encompassing three critical aspects: data, model, and system. It is important to note that not all steps are necessary in practical applications.
  • Figure 3: An overview of data optimization operations for on-device AI. Data filtering, feature extraction, data aggregation, data quantization, and edge computing frameworks can be employed to enhance the quality of data collected for on-device AI models.
  • Figure 4: An overview of model optimization operations. Model compression involves using various techniques, such as pruning, model quantization, and knowledge distillation, to reduce the size of the model and obtain a compact model that requires fewer resources while maintaining high accuracy. Model design involves creating lightweight models through manual and automated techniques, including architecture selection, parameter tuning, and regularization.
  • Figure 5: An overview of system optimization operations for on-device AI. Software optimization includes frameworks for lightweight model training and inference, while hardware optimization focuses on acceleration methods to improve computational efficiency.
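
As a rough illustration of two of the model compression operations named in the Figure 4 caption (pruning and model quantization), the sketch below applies magnitude pruning followed by symmetric 8-bit quantization to a small list of weights. It is not taken from the survey: the function names, the example weights, the 50% sparsity level, and the per-tensor symmetric scheme are all assumptions chosen for brevity, and practical deployments would normally rely on a framework's own tooling.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization: each weight w is stored as an
    integer q in [-127, 127] such that w is approximately scale * q."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate floating-point weights."""
    return [scale * v for v in q]


# Hypothetical weights, for illustration only
weights = [0.02, -1.27, 0.5, -0.01, 0.9, 0.0, -0.33, 0.07]

pruned = prune_by_magnitude(weights, sparsity=0.5)  # half of the weights become 0.0
q, scale = quantize_int8(pruned)                    # integers plus one float scale
restored = dequantize(q, scale)                     # approximation of `pruned`
```

In this scheme the rounding error per weight is bounded by half the scale, which is the kind of accuracy-versus-size trade-off the caption refers to: the compact model stores small integers and a single scale factor instead of full-precision values.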