Deep Semantic Inference over the Air: An Efficient Task-Oriented Communication System

Chenyang Wang, Roger Olsson, Stefan Forsström, Qing He

TL;DR

This paper tackles efficient wireless classification under latency and resource constraints by proposing a deep learning–driven task-oriented communication framework that partitions a neural network between edge and cloud and compresses the intermediate semantic features. It uses ResNet-18/34 backbones on CIFAR-10/100 with a simulated AWGN channel to study how the partition location and the semantic dimension affect accuracy and latency, showing that substantial computation and communication savings can be achieved while retaining over 85% of baseline accuracy. Key contributions include a tunable latency model, an analysis of early-to-late split points, practical guidance on balancing edge computation against channel transmission, and a public codebase. The findings are relevant to edge–cloud inference in bandwidth- and energy-constrained wireless environments, and point toward dynamic, adaptive semantic communication in real-world deployments.

Abstract

Empowered by deep learning, semantic communication marks a paradigm shift from transmitting raw data to conveying task-relevant meaning, enabling more efficient and intelligent wireless systems. In this study, we explore a deep learning-based task-oriented communication framework that jointly considers classification performance, computational latency, and communication cost. We evaluate ResNet-based models on the CIFAR-10 and CIFAR-100 datasets to simulate real-world classification tasks in wireless environments. We partition the model at various points to simulate split inference across a wireless channel. By varying the split location and the size of the transmitted semantic feature vector, we systematically analyze the trade-offs between task accuracy and resource efficiency. Experimental results show that, with appropriate model partitioning and semantic feature compression, the system can retain over 85% of baseline accuracy while significantly reducing both computational load and communication overhead.
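The abstract describes sending a semantic feature vector over a noisy wireless channel. As a rough illustration only, the sketch below passes a stand-in feature vector through an AWGN channel at a target SNR. It is not the authors' implementation: the function names `awgn_channel` and `measured_snr_db`, the real-valued noise model, and the choice to scale the noise to the measured signal power are all assumptions made for this example.

```python
import numpy as np


def awgn_channel(z, snr_db, rng=None):
    """Add white Gaussian noise to a real-valued feature array at a target SNR in dB.

    Assumption: the noise variance is set relative to the empirical mean power of z.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.asarray(z, dtype=np.float64)
    signal_power = np.mean(z ** 2)
    # SNR (linear) = signal_power / noise_power
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=z.shape)
    return z + noise


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean array and its noisy version."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)  # hypothetical stand-in for flattened semantic features
z_hat = awgn_channel(z, snr_db=5.0, rng=rng)
print(round(measured_snr_db(z, z_hat), 2))  # should be close to the 5 dB target
```

In a split-inference setup, `z` would be the compressed encoder output and `z_hat` the decoder input, so lowering `snr_db` shows how channel noise perturbs the features the receiver has to classify.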

Paper Structure

This paper contains 14 sections, 6 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: A deep learning-based task-oriented communication network.
  • Figure 2: Illustration of the ResNet-34 architecture and a split-inference task-oriented network based on the ResNet-34 Vanilla model (the unmodified baseline version). Specifically, the first three parts (input, conv1, and conv2_x) are placed in the transmitter (encoder) as pre-processing, while the remaining parts are placed in the receiver (decoder) as post-processing. The compression and decompression modules ensure that the post-processing input is consistent with the pre-processing output while preserving invariance and equivariance to translation. The semantic representation of dimension $z$ is transmitted from the encoder to the decoder over the AWGN channel. The FLOPs and proportion of each module are listed below. (Zoom in for better visualization.)
  • Figure 3: Comparison of Top-1 accuracy on CIFAR-10 and CIFAR-100 using ResNet-18 (Base + Split (SP-2) + AWGN) and ResNet-34 (Base + Split (SP-2) + AWGN) across different $z$ dimensions (implemented by varying the number of channels $N_c$) and SNR levels. Note that the names of the models above and the results on CIFAR-100 for SNR = 5 dB and $N_c=1024$ are consistent with those in Table \ref{vanilla}.
  • Figure 4: Comparison of Top-1 accuracy on CIFAR-100 using ResNet-34 across different $z$ dimensions, SNR levels, and split points. The results for SNR = 5 dB and $N_c=1024$ are consistent with those in Table \ref{flops}.
  • Figure 5: Normalized $T_{\mathrm{comp}}$ as a function of $\log_{10}(\beta)$.
  • ...and 1 more figure