Adaptive Semantic Token Communication for Transformer-based Edge Inference

Alessio Devoto; Jary Pomponi; Mattia Merluzzi; Paolo Di Lorenzo; Simone Scardapane

TL;DR

This work tackles semantic, goal-oriented edge inference under dynamic bandwidth and channel conditions by introducing a transformer-based deep joint source-channel coding (DJSCC) framework that transmits only a selected subset of semantic tokens. It combines adaptive token selection with complex-valued token representations and per-token compression, so that a single model remains robust across varying channel qualities. A controller based on Lyapunov stochastic optimization dynamically tunes the token budget $\alpha$ and the per-token dimensionality $r$ to balance inference accuracy against communication cost in time-varying networks, and the token-discard mechanism is interpretable. Experiments on Imagenette show consistent accuracy gains over baselines at fixed bandwidth, greater robustness to noise, and a clearly interpretable token selection process, highlighting practical potential for AI-native semantic communication in edge intelligence.
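The control loop described above can be illustrated with a generic drift-plus-penalty step, the standard device in Lyapunov stochastic optimization. This is a minimal sketch under stated assumptions, not the paper's controller: `accuracy_proxy`, the candidate grids for $\alpha$ and $r$, the cost model `alpha * r`, and the trade-off weight `V` are all hypothetical, and $\alpha$ is treated here as the fraction of tokens kept.

```python
import math

def accuracy_proxy(alpha, r, snr_db):
    """Hypothetical accuracy model (an assumption, not from the paper):
    saturates with the number of transmitted symbols and with the SNR."""
    quality = 1.0 - math.exp(-(alpha * r) / 16.0)
    channel = 1.0 / (1.0 + math.exp(-snr_db / 3.0))
    return quality * channel

def choose_config(Z, snr_db, V, alphas, ranks):
    """One drift-plus-penalty step: pick (alpha, r) minimising
    -V * accuracy + Z * cost over a small discrete grid."""
    best = None
    for alpha in alphas:
        for r in ranks:
            cost = alpha * r  # assumed communication cost per token slot
            score = -V * accuracy_proxy(alpha, r, snr_db) + Z * cost
            if best is None or score < best[0]:
                best = (score, alpha, r)
    return best[1], best[2]

def run(snrs, budget, V=50.0,
        alphas=(0.1, 0.25, 0.5, 1.0), ranks=(4, 8, 16, 32)):
    """Simulate the controller over a sequence of channel states.
    Z is a virtual queue that grows whenever the cost exceeds the budget."""
    Z = 0.0
    history = []
    for snr_db in snrs:
        alpha, r = choose_config(Z, snr_db, V, alphas, ranks)
        cost = alpha * r
        Z = max(Z + cost - budget, 0.0)  # virtual queue update
        history.append((alpha, r, cost, Z))
    return history
```

With an empty virtual queue the step favours the most accurate configuration; as the queue grows, cheaper configurations win. A larger `V` weights accuracy more heavily, at the price of a longer transient before the average cost settles near the budget.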

Abstract

This paper presents an adaptive framework for edge inference based on a dynamically configurable, transformer-powered deep joint source-channel coding (DJSCC) architecture. Motivated by a practical scenario in which a resource-constrained edge device engages in goal-oriented semantic communication, such as selectively transmitting essential features for object detection to an edge server, our approach enables efficient, task-aware data transmission under varying bandwidth and channel conditions. To achieve this, input data is tokenized into compact, high-level semantic representations, refined by a transformer, and transmitted over noisy wireless channels. As part of the DJSCC pipeline, we employ a semantic token selection mechanism that adaptively compresses informative features into a user-specified number of tokens per sample. These tokens are then further compressed by the JSCC module, enabling a flexible token communication strategy that adjusts both the number of transmitted tokens and their embedding dimensions. To enhance robustness under dynamic network conditions, we incorporate a resource allocation algorithm based on Lyapunov stochastic optimization that balances compression efficiency and task performance. Experimental results demonstrate that our system consistently outperforms existing baselines, highlighting its potential as a strong foundation for AI-native semantic communication in edge intelligence applications.
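The two compression knobs described in the abstract, the number of transmitted tokens and their embedding dimension, can be sketched as a toy pipeline. This is only an illustration under simplifying assumptions: top-k selection by a given score stands in for the learned semantic token selection, truncation to the first `r` dimensions stands in for the learned JSCC compression, tokens are real-valued rather than complex-valued, and the channel is additive white Gaussian noise. All function names are hypothetical.

```python
import random

def select_tokens(tokens, scores, k):
    """Keep the k highest-scoring tokens, preserving their original order.
    (Stand-in for the learned selection mechanism.)"""
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [tokens[i] for i in keep]

def compress(token, r):
    """Reduce a token to r dimensions by truncation.
    (Stand-in for a learned per-token compressor.)"""
    return token[:r]

def awgn(vec, noise_std, rng):
    """Add zero-mean Gaussian noise to every symbol."""
    return [x + rng.gauss(0.0, noise_std) for x in vec]

def transmit(tokens, scores, k, r, noise_std, seed=0):
    """Select k tokens, compress each to r dimensions, send over the channel.
    The number of channel symbols used is k * r."""
    rng = random.Random(seed)
    keep, selected = select_tokens(tokens, scores, k)
    received = [awgn(compress(t, r), noise_std, rng) for t in selected]
    return keep, received

if __name__ == "__main__":
    tokens = [[float(i)] * 8 for i in range(6)]  # 6 tokens of dimension 8
    scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]      # hypothetical importance scores
    keep, received = transmit(tokens, scores, k=3, r=4, noise_std=0.1)
    print(keep, len(received), len(received[0]))
```

Lowering `k` or `r` reduces the number of symbols sent (`k * r`) at the expense of the information available to the receiver, which is the trade-off the resource allocation algorithm is meant to manage.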
