A Contemporary Overview: Trends and Applications of Large Language Models on Mobile Devices

Lianjun Liu, Hongli An, Pengxuan Chen, Longxiang Ye

TL;DR

The paper addresses the challenge of delivering powerful language capabilities on mobile devices with low latency and strong privacy. It surveys on-device LLM deployment, hardware accelerators, and edge-cloud cooperation, with an emphasis on practical applications and their performance implications. Its key contributions are a synthesis of application domains (voice, translation, AR, healthcare) and of concrete strategies for enabling mobile LLMs (quantization, distillation, NPUs, MEC). The analysis highlights how mobile LLMs enable offline, privacy-preserving, and context-aware mobile experiences, and it outlines trends toward deeper AR/IoT integration and ultra-low-latency communication.

Abstract

Large language models (LLMs) have developed rapidly, and their powerful natural language processing and generation capabilities are poised to provide more natural and personalized user experiences. Deploying them on mobile devices is becoming a significant trend in intelligent devices. LLMs have shown considerable potential in applications such as voice assistants, real-time translation, and intelligent recommendations. Advances in hardware (such as neural network accelerators) and network infrastructure (such as 5G) have enabled efficient local inference and low-latency responses on mobile devices, which reduces reliance on cloud computing while enhancing data privacy and security. Developers can integrate LLM functionality through open APIs and SDKs to build more innovative intelligent applications. The widespread use of LLMs not only makes mobile devices more intelligent but also fosters integration with fields such as augmented reality (AR) and the Internet of Things (IoT). This trend is expected to drive the development of the next generation of intelligent mobile applications.

Paper Structure

This paper contains 18 sections.