Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing

Jiarun Ding, Peiwen Jiang, Chao-Kai Wen, Shi Jin

TL;DR

This work targets semantic-aware image transmission by treating objects and backgrounds with differing importance under bandwidth constraints. It introduces ASCViT-JSCC, a ViT-based JSCC framework that adaptively masks image patches using YOLOv5 and SIFT-derived importance, and transmits via OFDM with quantization to meet digital modulation standards. The system is trained in a two-stage process and validated through extensive simulations and an OTA testbed (ICP) comprising SDRs and NVIDIA kits, showing superior object preservation and reconstruction across AWGN and fading channels compared to baselines. The study demonstrates practical viability and provides a hardware-oriented platform for future intelligent communication research, with robust performance gains in realistic wireless scenarios.

Abstract

Semantic communication has evolved considerably with the recent rapid development of artificial intelligence (AI), which has significantly enhanced both communication robustness and efficiency. Despite these advancements, most current semantic communication methods for image transmission pay little attention to the differing importance of objects and backgrounds in images. To address this issue, we propose a novel scheme named ASCViT-JSCC, which integrates vision transformers (ViTs) with an orthogonal frequency division multiplexing (OFDM) system. This scheme adaptively allocates bandwidth to objects and backgrounds according to an importance order of the image parts, determined by You Only Look Once version 5 (YOLOv5) object detection and scale-invariant feature transform (SIFT) feature-point detection. Furthermore, the proposed scheme adheres to digital modulation standards by incorporating quantization modules. We validate this approach on an over-the-air (OTA) testbed named the intelligent communication prototype validation platform (ICP), which is based on a software-defined radio (SDR) and NVIDIA embedded kits. Our findings from both simulations and practical measurements show that ASCViT-JSCC significantly preserves objects in images and enhances reconstruction quality compared to existing methods.
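The importance ordering described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual ranking rule, so everything here is an assumption: patches that overlap a detected bounding box are ranked above all background patches, and ties are broken by the number of feature points inside the patch. The detector outputs are passed in as plain lists (no YOLOv5 or SIFT call is made), and the names `patch_importance`, `keep_mask`, and `mask_ratio` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
Point = Tuple[float, float]              # (x, y) in pixels


def patch_importance(height: int, width: int, patch: int,
                     boxes: Sequence[Box],
                     keypoints: Sequence[Point]) -> List[List[float]]:
    """Score each patch (assumed rule): keypoint count plus an object bonus."""
    rows, cols = height // patch, width // patch
    scores = [[0.0] * cols for _ in range(rows)]
    # Count the feature points that fall inside each patch.
    for x, y in keypoints:
        r, c = int(y) // patch, int(x) // patch
        if 0 <= r < rows and 0 <= c < cols:
            scores[r][c] += 1.0
    # The bonus exceeds any possible keypoint count, so every patch that
    # overlaps a detected box outranks every pure-background patch.
    bonus = len(keypoints) + 1.0
    for r in range(rows):
        for c in range(cols):
            px0, py0 = c * patch, r * patch
            px1, py1 = px0 + patch, py0 + patch
            if any(px0 < x1 and x0 < px1 and py0 < y1 and y0 < py1
                   for x0, y0, x1, y1 in boxes):
                scores[r][c] += bonus
    return scores


def keep_mask(scores: List[List[float]], mask_ratio: float) -> List[List[bool]]:
    """Keep the highest-scoring patches; mask (drop) the given fraction."""
    flat = [(s, r, c) for r, row in enumerate(scores) for c, s in enumerate(row)]
    n_keep = int(round(len(flat) * (1.0 - mask_ratio)))
    # The sort is stable, so equal scores stay in raster order.
    flat.sort(key=lambda item: -item[0])
    keep = [[False] * len(row) for row in scores]
    for _, r, c in flat[:n_keep]:
        keep[r][c] = True
    return keep


if __name__ == "__main__":
    scores = patch_importance(64, 64, 16,
                              boxes=[(0, 0, 20, 20)],
                              keypoints=[(50, 50), (50, 52), (40, 10)])
    for row in keep_mask(scores, mask_ratio=0.5):
        print("".join("#" if kept else "." for kept in row))
```

Raising `mask_ratio` removes background patches first and object patches last, which is the behaviour the abstract attributes to adaptive bandwidth allocation. In a real pipeline the `boxes` and `keypoints` arguments would come from an object detector and a feature-point detector.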

Paper Structure (13 sections, 7 equations, 6 figures, 2 tables)


Figures (6)

  • Figure 1: The structure of ASCViT-JSCC.
  • Figure 2: The adaptive preprocessing.
  • Figure 3: The structure of the ASCViT-JSCC NN. For ViT(a, b, c, d), a, b, c, and d denote the patch size, embedding dimension, number of heads, and number of blocks, respectively. For CNN(a, b, c, d), a, b, c, and d denote the number of input channels, number of output channels, kernel size, and stride, respectively. The numbers next to the diagram are the output dimensions of the networks.
  • Figure 4: Two metrics versus MR. SNR indicates the SNRs of test channels.
  • Figure 5: Performance of ASCViT-JSCC compared to other schemes in AWGN and Rayleigh fading channels. "R" indicates that the NNs are trained at random SNRs uniformly sampled from [-5, 15]. "FT" indicates that the networks are fine-tuned at the test SNRs.
  • ...and 1 more figure
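The training setup named in the Figure 5 caption (random SNRs drawn uniformly from [-5, 15], tested over AWGN and Rayleigh fading channels) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's channel model: the SNR range is assumed to be in dB, the fading is a single flat block-fading tap with perfect equalization rather than a full OFDM link, and the function names `sample_training_snr_db` and `transmit` are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sample_training_snr_db(rng: random.Random,
                           low: float = -5.0, high: float = 15.0) -> float:
    """Draw one training SNR uniformly from [low, high] (dB is an assumption)."""
    return rng.uniform(low, high)


def _complex_gauss(rng: random.Random, variance: float) -> complex:
    """Circularly symmetric complex Gaussian sample with the given variance."""
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def transmit(symbols: Sequence[complex], snr_db: float,
             rng: random.Random, fading: bool = False) -> List[complex]:
    """Pass symbols through AWGN, optionally with one Rayleigh fading tap."""
    # Noise variance is set relative to the average transmit power.
    power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    noise_var = power / (10.0 ** (snr_db / 10.0))
    # One coefficient h ~ CN(0, 1) for the whole block when fading is on.
    h = _complex_gauss(rng, 1.0) if fading else complex(1.0, 0.0)
    # The receiver is assumed to know h exactly and divides it out.
    return [(h * s + _complex_gauss(rng, noise_var)) / h for s in symbols]


if __name__ == "__main__":
    rng = random.Random(0)
    qpsk = [complex(1, 1), complex(-1, 1), complex(-1, -1), complex(1, -1)]
    snr_db = sample_training_snr_db(rng)
    received = transmit(qpsk, snr_db, rng, fading=True)
    print(f"SNR = {snr_db:.2f} dB")
    for sent, got in zip(qpsk, received):
        print(sent, "->", got)
```

Drawing a fresh SNR for each training batch corresponds to the "R" setting in the caption, while fixing `snr_db` to the evaluation value corresponds to the "FT" setting.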