Communication-Efficient Edge AI: Algorithms and Systems

Yuanming Shi, Kai Yang, Tao Jiang, Jun Zhang, Khaled B. Letaief

TL;DR

This survey reviews recent techniques for overcoming key communication challenges in edge AI systems, and introduces communication-efficient methods from both algorithmic and system perspectives for training and inference tasks at the network edge.

Abstract

Artificial intelligence (AI) has achieved remarkable breakthroughs in a wide range of fields, from speech processing and image classification to drug discovery. This is driven by the explosive growth of data, advances in machine learning (especially deep learning), and easy access to vastly powerful computing resources. In particular, the wide-scale deployment of edge devices (e.g., IoT devices) generates an unprecedented amount of data, which provides the opportunity to derive accurate models and develop various intelligent applications at the network edge. However, such enormous volumes of data cannot all be sent from end devices to the cloud for processing, due to varying channel quality, traffic congestion, and/or privacy concerns. By pushing the inference and training processes of AI models to edge nodes, edge AI has emerged as a promising alternative. AI at the edge requires close cooperation among edge devices, such as smartphones and smart vehicles, and edge servers at wireless access points and base stations, which, however, results in heavy communication overheads. In this paper, we present a comprehensive survey of recent developments in various techniques for overcoming these communication challenges. Specifically, we first identify key communication challenges in edge AI systems. We then introduce communication-efficient techniques, from both algorithmic and system perspectives, for training and inference tasks at the network edge. Potential future research directions are also highlighted.

Paper Structure

This paper contains 26 sections, 1 equation, 4 figures, 1 table.

Figures (4)

  • Figure 1: Illustration of edge AI including edge training and edge inference.
  • Figure 3: Illustration of different optimization methods for model training. As a typical example, a generalized linear model is trained where each node $i$ has one data instance $\bm{x}_i$. That is, the target is to solve $\min_{\bm{w}}\frac{1}{K}\sum_{i=1}^{K}\ell(\bm{w}^{\sf T}\bm{x}_i)$, where $\ell$ is the loss function and $\ell^\prime$ denotes its first-order derivative. a) Zeroth-order method: only the function value can be evaluated during training [nesterov2017random]. b) First-order method: gradient descent. c) Second-order method: DANE [shamir2014communication]. d) Federated optimization: federated averaging algorithm [mcmahan2017communication].
  • Figure 6: Computation offloading based edge inference systems.
  • Figure 7: MapReduce computation model [Yuanming_WDCTSP18].
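
The first-order setting in the Figure 3 caption can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the logistic loss is chosen here as an example of the loss function, and the function names, step size, round count, and toy data are all hypothetical. Each of the K nodes holds one instance x_i, computes its local gradient ℓ'(wᵀx_i) x_i, and a coordinator averages the K uploads, which gives the exact gradient of (1/K) Σ ℓ(wᵀx_i).

```python
import math

def loss(z):
    # Logistic loss: an illustrative choice of loss (assumption, not from the paper).
    return math.log1p(math.exp(-z))

def loss_prime(z):
    # Derivative of the logistic loss above.
    return -1.0 / (1.0 + math.exp(z))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def objective(w, xs):
    # (1/K) * sum_i loss(w^T x_i), the objective in the Figure 3 caption.
    return sum(loss(dot(w, x)) for x in xs) / len(xs)

def local_gradient(w, x):
    # Node i computes loss'(w^T x_i) * x_i from its single instance.
    s = loss_prime(dot(w, x))
    return [s * xj for xj in x]

def gradient_descent(xs, step=0.5, rounds=50):
    # Hypothetical step size and round count, chosen for the toy data below.
    w = [0.0] * len(xs[0])
    for _ in range(rounds):
        # One gradient upload per node per round: this is the communication cost.
        grads = [local_gradient(w, x) for x in xs]
        avg = [sum(g[j] for g in grads) / len(xs) for j in range(len(w))]
        w = [wj - step * gj for wj, gj in zip(w, avg)]
    return w

# Toy data: three nodes, one two-dimensional instance each.
xs = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.3]]
w = gradient_descent(xs)
```

In this sketch every round costs one d-dimensional upload per node, which is the kind of per-round communication overhead the surveyed techniques aim to reduce.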