Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder

Nan Li, Alexandros Iosifidis, Qi Zhang

TL;DR

The paper tackles CNN inference offloading in dynamic multi-access edge computing by introducing semantic compression via AECNN, which prunes channels through a channel-attention mechanism, encodes the remaining data with entropy coding, and decodes via a lightweight feature recovery module to preserve accuracy. To cope with stochastic channel conditions and fluctuating edge resources, it proposes GRL-AECNN, a graph-convolutional, actor-critic framework that learns offloading decisions over a graph representation of the MEC state and action space. The approach decouples the problem into offloading and resource-allocation subproblems, uses a reward function $\Upsilon$ that balances latency and accuracy with a penalty $\psi$, and demonstrates superior performance against DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC across dynamic scenarios. The proposed method improves long-term inference accuracy and throughput while maintaining service reliability in the presence of uncertain computation times and imperfect channel state information, underscoring its practical impact for real-time MEC-enabled AI applications.

Abstract

This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and computation resource availability, we propose a novel semantic compression method, autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading. In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, which compresses intermediate data by selecting the most informative features. In the semantic decoder, we design a lightweight decoder that learns to reconstruct the intermediate data from the received compressed data, improving accuracy. To effectively trade off communication, computation, and inference accuracy, we design a reward function and formulate the offloading problem of CNN inference as a maximization problem whose goal is to maximize the average inference accuracy and throughput over the long term. To solve this maximization problem, we propose a graph reinforcement learning-based AECNN (GRL-AECNN) method, which outperforms the existing methods DROO-AECNN, GRL-BottleNet++ and GRL-DeepJSCC under different dynamic scenarios. This highlights the advantages of GRL-AECNN for offloading decision-making in dynamic MEC.
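The idea of compressing intermediate data by selecting the most informative features can be illustrated with a small sketch. The code below is not the paper's feature compression module: it assumes a squeeze-and-excitation-style attention (global average pooling, a two-layer mapping, a sigmoid) and a simple top-k selection rule. The names `channel_attention_scores` and `select_channels`, the weight matrices `w1` and `w2`, and the parameter `keep` are all hypothetical. Entropy coding and the decoder-side feature recovery are left out.

```python
import math


def channel_attention_scores(feature_maps, w1, w2):
    """Score each channel in (0, 1) with an SE-style attention (assumed layout).

    feature_maps: list of C channels, each a list of H rows of W floats.
    w1: hidden_dim x C weight matrix, w2: C x hidden_dim weight matrix.
    """
    # Squeeze: one scalar per channel via global average pooling.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excite: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def select_channels(feature_maps, scores, keep):
    """Keep the `keep` highest-scoring channels (hypothetical top-k rule)."""
    order = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    kept = sorted(order[:keep])  # preserve the original channel order
    return kept, [feature_maps[c] for c in kept]


# Toy input: 4 channels of 2x2 maps whose mean activations are 0, 1, 2, 3.
maps = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
# Identity weights, so each score is simply sigmoid(mean activation).
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

scores = channel_attention_scores(maps, identity, identity)
kept, pruned = select_channels(maps, scores, keep=2)
print(kept)  # indices of the transmitted channels
```

With identity weights the scores increase with the mean activation, so the two strongest channels are kept and only half of the intermediate data would need to be transmitted. In a trained model the weights would be learned, and the receiver would also need the kept indices to place the channels correctly before recovery.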

Paper Structure

This paper contains 29 sections, 1 theorem, 32 equations, 11 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Let $\psi(x) \triangleq 2\left(1 - {\textit{sigmoid}}\left(\frac{5x}{\sigma_u^k}\right)\right)$. For any completion time $t_{u,s,k}$ and latency requirement $\sigma_u^k$, we have:

Figures (11)

  • Figure 1: Output feature maps of the 1st convolutional layer (CL) in ResNet-50. The blue one is almost useless for inference, while the red one carries enough information to generate the rest.
  • Figure 2: An example of task offloading in an MEC network.
  • Figure 3: The proposed AECNN architecture in a device-edge co-inference system. (a) depicts the overall framework of AECNN; (b) shows the design of the FC module; and (c) displays the FR module, built from a CNN with group convolutional layers.
  • Figure 4: Framework of the graph reinforcement learning-based AECNN.
  • Figure 5: Attention weights at splitting point $l=1$.
  • ...and 6 more figures

Theorems & Definitions (1)

  • Theorem 1
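
The function $\psi$ defined in Theorem 1 can be evaluated directly from its definition, $\psi(x) = 2\left(1 - \mathrm{sigmoid}(5x/\sigma_u^k)\right)$. The sketch below is a plain numerical check of that formula. The names `psi` and `sigma` (standing for $\sigma_u^k$) are illustrative, and the argument `x` is passed through as-is, since the quantity the theorem substitutes for it is not reproduced here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def psi(x, sigma):
    """psi(x) = 2 * (1 - sigmoid(5 * x / sigma)), with sigma > 0."""
    if sigma <= 0:
        raise ValueError("latency requirement sigma must be positive")
    return 2.0 * (1.0 - sigmoid(5.0 * x / sigma))


sigma = 1.0  # assumed latency requirement, arbitrary units
for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, psi(x, sigma))
```

From the definition alone, $\psi(0) = 1$, and $\psi$ decreases strictly in $x$, staying between 0 and 2. The factor 5 makes the drop steep: at $x = \sigma_u^k$ the value is $2(1 - \mathrm{sigmoid}(5)) \approx 0.0134$.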