Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder
Nan Li, Alexandros Iosifidis, Qi Zhang
TL;DR
The paper tackles CNN inference offloading in dynamic multi-access edge computing by introducing semantic compression via AECNN, which prunes channels through a channel-attention mechanism, encodes the remaining data with entropy coding, and decodes via a lightweight feature recovery module to preserve accuracy. To cope with stochastic channel conditions and fluctuating edge resources, it proposes GRL-AECNN, a graph-convolutional, actor-critic framework that learns offloading decisions over a graph representation of the MEC state and action space. The approach decouples the problem into offloading and resource-allocation subproblems, uses a reward function $\Upsilon$ that balances latency and accuracy with a penalty $\psi$, and demonstrates superior performance against DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC across dynamic scenarios. The proposed method improves long-term inference accuracy and throughput while maintaining service reliability in the presence of uncertain computation times and imperfect channel state information, underscoring its practical impact for real-time MEC-enabled AI applications.
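The channel-pruning step described above can be illustrated with a minimal NumPy sketch. It is not the authors' AECNN implementation: it assumes a squeeze-and-excitation style gate (global average pooling followed by two small dense layers) as the channel-attention mechanism, and the names `channel_scores`, `prune_channels`, `w1`, `w2`, and the reduction ratio `r` are hypothetical. Random weights stand in for trained ones, and entropy coding and the feature recovery module are omitted.

```python
import numpy as np

def channel_scores(features, w1, w2):
    """Assumed SE-style channel attention: one importance score per channel."""
    squeezed = features.mean(axis=(1, 2))         # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # ReLU bottleneck -> (C // r,)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate -> (C,)

def prune_channels(features, scores, keep):
    """Keep the `keep` highest-scoring channels and return them with their indices."""
    idx = np.sort(np.argsort(scores)[-keep:])     # top-k indices, in channel order
    return features[idx], idx

# Hypothetical intermediate feature map: C channels of H x W activations.
rng = np.random.default_rng(0)
C, H, W, r = 16, 8, 8, 4
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))             # untrained placeholder weights
w2 = rng.standard_normal((C, C // r))

scores = channel_scores(features, w1, w2)
compressed, kept = prune_channels(features, scores, keep=4)
print(compressed.shape, kept)
```

Only the kept channels and their indices would need to be transmitted, so in this sketch the payload shrinks by a factor of `C / keep` before any entropy coding is applied.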
Abstract
This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and computation resource availability, we propose a novel semantic compression method, the autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading. In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, which compresses intermediate data by selecting the most informative features. In the semantic decoder, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, which improves accuracy. To trade off communication, computation, and inference accuracy effectively, we design a reward function and formulate the offloading of CNN inference as a problem of maximizing the long-term average inference accuracy and throughput. To solve this problem, we propose a graph reinforcement learning-based AECNN (GRL-AECNN) method, which outperforms the existing methods DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC under different dynamic scenarios. This highlights the advantages of GRL-AECNN for offloading decision-making in dynamic MEC.
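The exact form of the reward function is not given in this summary, so the following is only a hypothetical sketch of how a reward could trade inference accuracy against latency. It assumes that a task completed within its deadline earns its inference accuracy and that a late task incurs a fixed penalty `psi`; the function names, the deadline model, and the sample numbers are all illustrative assumptions, not the paper's definition.

```python
def reward(accuracy, latency, deadline, psi=1.0):
    """Hypothetical per-task reward: accuracy if the deadline is met, else -psi."""
    if latency <= deadline:
        return accuracy
    return -psi

def average_reward(tasks, psi=1.0):
    """Mean reward over (accuracy, latency, deadline) tuples."""
    return sum(reward(a, t, d, psi) for a, t, d in tasks) / len(tasks)

# Illustrative tasks: (accuracy, latency in s, deadline in s); the second misses its deadline.
tasks = [(0.92, 0.08, 0.10), (0.95, 0.15, 0.10), (0.88, 0.05, 0.10)]
avg = average_reward(tasks, psi=0.5)
print(round(avg, 4))
```

Under this assumed form, a more aggressive compression setting that lowers accuracy slightly but meets the deadline scores higher than an accurate but late result, which is the kind of trade-off a long-term average objective would reward.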
