Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder

Nan Li; Alexandros Iosifidis; Qi Zhang

TL;DR

The paper tackles CNN inference offloading in dynamic multi-access edge computing by introducing semantic compression via AECNN, which prunes channels through a channel-attention mechanism, encodes the remaining data with entropy coding, and decodes via a lightweight feature recovery module to preserve accuracy. To cope with stochastic channel conditions and fluctuating edge resources, it proposes GRL-AECNN, a graph-convolutional, actor-critic framework that learns offloading decisions over a graph representation of the MEC state and action space. The approach decouples the problem into offloading and resource-allocation subproblems, uses a reward function $\Upsilon$ that balances latency and accuracy with a penalty $\psi$, and demonstrates superior performance against DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC across dynamic scenarios. The proposed method improves long-term inference accuracy and throughput while maintaining service reliability in the presence of uncertain computation times and imperfect channel state information, underscoring its practical impact for real-time MEC-enabled AI applications.
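The penalty $\psi$ mentioned above is defined in Theorem 1 as $\psi(x) = 2\left(1 - \mathrm{sigmoid}(5x/\sigma_u^k)\right)$, where $\sigma_u^k$ is the latency requirement. The following is a minimal Python sketch that evaluates this expression; it is not the authors' implementation. Reading $x$ as a time quantity in the same units as $\sigma_u^k$ is an assumption, and the function names and sample values are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def psi(x: float, sigma: float) -> float:
    """Evaluate psi(x) = 2 * (1 - sigmoid(5 * x / sigma)) from Theorem 1.

    x     : time-like argument (assumed to share units with sigma)
    sigma : latency requirement, must be positive
    """
    if sigma <= 0:
        raise ValueError("latency requirement sigma must be positive")
    return 2.0 * (1.0 - sigmoid(5.0 * x / sigma))


if __name__ == "__main__":
    sigma = 0.1  # hypothetical latency requirement, in seconds
    for x in (-0.1, 0.0, 0.05, 0.1, 0.2):
        print(f"x={x:+.2f}  psi={psi(x, sigma):.4f}")
```

By construction, `psi` equals 1 at `x = 0`, decreases monotonically in `x`, and stays strictly between 0 and 2, so larger arguments relative to the latency requirement shrink the value smoothly rather than cutting it off at a hard threshold.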

Abstract

This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and computation resource availability, we propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading. In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, which compresses intermediate data by selecting the most informative features. In the semantic decoder, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, which improves accuracy. To trade off communication, computation, and inference accuracy effectively, we design a reward function and formulate the offloading problem of CNN inference as a maximization problem whose goal is to maximize the average inference accuracy and throughput over the long term. To solve this maximization problem, we propose a graph reinforcement learning-based AECNN (GRL-AECNN) method, which outperforms the existing methods DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC under different dynamic scenarios. This highlights the advantages of GRL-AECNN in offloading decision-making in dynamic MEC.
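The idea of compressing intermediate data by keeping only the most informative channels can be illustrated with a small sketch. The code below is a simplified stand-in, not the AECNN feature compression module: in the paper the attention weights are learned, whereas here each channel is scored by its mean absolute activation, which is an assumption made purely for illustration. The names `channel_scores` and `compress` are hypothetical, and entropy coding and the decoder-side feature recovery are omitted.

```python
from typing import List, Sequence, Tuple

FeatureMap = List[List[float]]  # one channel, stored as H x W values


def channel_scores(features: Sequence[FeatureMap]) -> List[float]:
    """Score each channel by its mean absolute activation.

    This replaces the learned channel-attention weights with a fixed
    heuristic (an assumption for this sketch).
    """
    scores = []
    for channel in features:
        total = sum(abs(v) for row in channel for v in row)
        count = sum(len(row) for row in channel)
        scores.append(total / count)
    return scores


def compress(
    features: Sequence[FeatureMap], keep: int
) -> Tuple[List[int], List[FeatureMap]]:
    """Keep the `keep` highest-scoring channels.

    Returns the kept channel indices (ascending) and their feature maps,
    i.e. what a device would need to send to the edge server.
    """
    if not 0 < keep <= len(features):
        raise ValueError("keep must be between 1 and the number of channels")
    scores = channel_scores(features)
    ranked = sorted(range(len(features)), key=lambda c: scores[c], reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [features[c] for c in kept]


if __name__ == "__main__":
    # Three hypothetical 2x2 channels; channel 0 carries no signal.
    features = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 2.0], [3.0, 4.0]],
        [[0.1, 0.1], [0.1, 0.1]],
    ]
    kept, payload = compress(features, keep=2)
    print("kept channels:", kept)
    print("channels sent:", len(payload), "of", len(features))
```

Returning the kept indices alongside the pruned feature maps reflects the fact that a receiver needs to know which channels survived before it can attempt to reconstruct the rest.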

Paper Structure (29 sections, 1 theorem, 32 equations, 11 figures, 2 tables, 2 algorithms)

Introduction
System Model
Task Model
Communication Model
Computation Model
Architecture of AECNN
Overview of AECNN Architecture
Feature compression module
Feature recovery module
Problem Formulation
Task Completion Time and Energy Consumption
Objective
Graph reinforcement learning-based AECNN
GRL-AECNN framework
Actor network
...and 14 more sections

Key Result

Theorem 1

Let $\psi(x) \triangleq 2\left(1 - {\textit{sigmoid}}\left(\frac{5x}{\sigma_u^k}\right)\right)$. For any completion time $t_{u,s,k}$ and latency requirement $\sigma_u^k$, we have:

Figures (11)

Figure 1: Output feature maps of the first convolutional layer (CL) in ResNet-50. The feature map marked in blue is almost useless for inference, while the one marked in red carries enough information to generate the rest.
Figure 2: An example of task offloading in an MEC network.
Figure 3: The proposed AECNN architecture in a device-edge co-inference system. (a) depicts the overall framework of AECNN; (b) shows the design of the FC module; and (c) displays the FR module, designed as a CNN with group convolutional layers.
Figure 4: Framework of graph reinforcement learning-based AECNN
Figure 5: Attention weights of splitting point $l=1$.
...and 6 more figures

Theorems & Definitions (1)

Theorem 1
