SCSC: A Novel Standards-Compatible Semantic Communication Framework for Image Transmission

Xue Han, Yongpeng Wu, Zhen Gao, Biqian Feng, Yuxuan Shi, Deniz Gündüz, Wenjun Zhang

TL;DR

The paper addresses image transmission over MIMO channels with SCSC, a standards-compatible JSCC framework for semantic communication. It introduces two learnable modules, PPEN for semantic preprocessing and PCEN for finite-alphabet precoding, which preserve compatibility with legacy SSCC components, and it uses a proxy network to enable end-to-end training despite the non-differentiable codecs. Experiments on Cityscapes and CVRG-Pano show up to $29.46\%$ channel bandwidth savings and improvements in PSNR, MS-SSIM, and mIoU across a range of SNRs and coding rates, along with robustness to unseen datasets and tasks. Because the approach remains modular and deployable on existing hardware, it is presented as a practical route to semantic communications in future wireless networks.

Abstract

Joint source-channel coding (JSCC) is a promising paradigm for next-generation communication systems, particularly in challenging transmission environments. In this paper, we propose a novel standard-compatible JSCC framework for the transmission of images over multiple-input multiple-output (MIMO) channels. Unlike existing end-to-end AI-based DeepJSCC schemes, our framework consists of learnable modules that enable communication using conventional separate source and channel codes (SSCC), which makes it easy to deploy on legacy systems. Specifically, the learnable modules comprise a preprocessing-empowered network (PPEN) for preserving essential semantic information, and a precoder & combiner-enhanced network (PCEN) for efficient transmission over a resource-constrained MIMO channel. We treat existing compression and channel coding modules as non-trainable blocks. Since these modules are non-differentiable, we employ a proxy network that mimics their operations when training the learnable modules. Numerical results demonstrate that our scheme can save more than 29% of the channel bandwidth and has lower complexity than the constrained baselines. We also show its generalization capability to unseen datasets and tasks through extensive experiments.
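The idea of training a learnable module in front of a non-differentiable block can be illustrated with a toy example. The sketch below is not the paper's proxy network: it assumes the simplest possible surrogate (an identity map, i.e. a straight-through gradient), a toy uniform quantizer named `codec` in place of a real image codec, and a single scalar `gain` in place of a learnable preprocessing network. All of these names and choices are hypothetical and chosen only for illustration.

```python
# Sketch only: the identity surrogate below is an assumption standing in for a
# learned proxy network; codec() is a toy quantizer, not a real image codec.

def codec(values, step=0.5):
    """Non-differentiable stand-in for a standard codec: uniform quantization."""
    return [round(v / step) * step for v in values]

def train_gain(inputs, targets, gain=1.0, lr=0.1, steps=200):
    """Fit a scalar preprocessing gain placed in front of the codec.

    Forward pass: the real (non-differentiable) codec.
    Backward pass: the codec's Jacobian is replaced by that of the surrogate
    (identity), so d(output)/d(gain) is approximated by the input itself.
    """
    n = len(inputs)
    for _ in range(steps):
        outputs = codec([gain * x for x in inputs])
        # dLoss/dgain for the mean squared error, routed through the surrogate.
        grad = sum(2.0 * (y - t) * x
                   for y, t, x in zip(outputs, targets, inputs)) / n
        gain -= lr * grad
    # Evaluate the final loss with the real codec, not the surrogate.
    outputs = codec([gain * x for x in inputs])
    loss = sum((y - t) ** 2 for y, t in zip(outputs, targets)) / n
    return gain, loss

inputs = [0.25, 0.5, 0.75, 1.0]
targets = [2.0 * x for x in inputs]  # codec outputs we would like to reach
gain, loss = train_gain(inputs, targets)
print(gain, loss)
```

In this toy setup the gradient of the quantizer itself is zero almost everywhere, so training only makes progress because the backward pass uses the surrogate. A learned proxy would take the place of the identity map when the real codec is not well approximated by it.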

Paper Structure (33 sections, 20 equations, 16 figures, 3 tables, 2 algorithms)

Figures (16)

  • Figure 1: (a) Conventional digital communication system. (b) Overview of the proposed SCSC framework for semantic communications.
  • Figure 2: The detailed framework of our PPEN module. The DAC module is used in the last stage of the distortion compensation layer.
  • Figure 3: Illustration of the DAC semantic feature extraction block and the detailed process of deformable convolution.
  • Figure 4: The diagram of the PEN and CEN models with parameters $\bm \theta$ and $\bm \eta$, respectively. The PEN consists of $T$ iteration rounds in total, and each layer has the same structure, containing the linear estimator $\mathbf{W}$, the nonlinear estimator $\Pi_{\mathcal{M}}$, and the trainable variables $\gamma^t$ and $\alpha^t$. $\mathbf z_d^*$ is the final output of the PEN. The CEN is primarily implemented by a fully connected layer with trainable parameter $\bm \eta$.
  • Figure 5: The structure of the proxy network.
  • ...and 11 more figures
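
The Figure 4 caption names a nonlinear estimator $\Pi_{\mathcal{M}}$ without defining it here. One common reading of such an operator in finite-alphabet precoding is a projection of each relaxed symbol onto the nearest point of the alphabet $\mathcal{M}$. The sketch below shows that reading only. The unit-energy QPSK alphabet, the hard nearest-point rule, and the function name `project_to_alphabet` are assumptions, and the paper's estimator may differ (for example, it may be a soft projection).

```python
# Hypothetical alphabet: unit-energy QPSK. The alphabet used in the paper is
# not stated in this section.
QPSK = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]

def project_to_alphabet(symbols, alphabet=QPSK):
    """Map each complex value to the closest alphabet point (Euclidean distance)."""
    return [min(alphabet, key=lambda s: abs(v - s)) for v in symbols]

# Relaxed (continuous-valued) symbols, e.g. the output of a linear step.
relaxed = [0.9 + 0.2j, -0.1 - 1.3j, -0.6 + 0.7j]
projected = project_to_alphabet(relaxed)
print(projected)
```

Each output value is a member of the alphabet, so a signal built from `projected` satisfies the finite-alphabet constraint by construction.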