Joint Coding-Modulation for Digital Semantic Communications via Variational Autoencoder

Yufei Bo; Yiheng Duan; Shuo Shao; Meixia Tao

TL;DR

The paper tackles digital semantic communications by integrating coding and modulation within a variational autoencoder framework. It learns a probabilistic transition from source data to discrete constellation symbols, which makes training differentiable and channel-aware. It introduces a mutual-information-based objective and derives a tractable variational lower bound, enabling joint optimization of an encoder-modulator and two decoders under AWGN channel conditions. Empirical results on CIFAR-10 and Tiny ImageNet show that Joint Coding-Modulation (JCM) consistently outperforms quantization-based digital baselines across SNRs, rates, and modulation orders. Higher-order modulation narrows the gap to analog semantic communication and induces probabilistic shaping behavior. The work demonstrates the practical viability of digital semantic transmission with probabilistic constellation generation and points toward constellation-geometry shaping as a future improvement.

Abstract

Semantic communications have emerged as a new paradigm for improving communication efficiency by transmitting the semantic information of a source message that is most relevant to a desired task at the receiver. Most existing approaches use neural networks (NNs) to design end-to-end semantic communication systems, where NN-based semantic encoders output continuously distributed signals that are sent directly to the channel in an analog fashion. In this work, we propose a joint coding-modulation (JCM) framework for digital semantic communications using a variational autoencoder (VAE). Our approach learns the transition probability from source data to discrete constellation symbols, thereby avoiding the non-differentiability problem of digital modulation. Moreover, by jointly designing the coding and modulation processes, we can match the obtained modulation strategy to the operating channel condition. We also derive a matching loss function with information-theoretic meaning for end-to-end training. Experiments on image semantic communication validate the superiority of our proposed JCM framework over the state-of-the-art quantization-based digital semantic coding-modulation methods across a wide range of channel conditions, transmission rates, and modulation orders. Furthermore, its performance gap to analog semantic communication narrows as the modulation order increases, while it retains the hardware implementation convenience of digital modulation.
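To make the idea of learning a transition probability over discrete constellation symbols more concrete, here is a minimal NumPy sketch. It assumes a Gumbel-softmax relaxation as the differentiable sampling step; this summary does not say which relaxation the authors use, so treat that choice as an illustration only. The function names (`qam_constellation`, `gumbel_softmax`, `awgn`), the temperature, and the random logits standing in for an encoder output are all hypothetical.

```python
import numpy as np


def qam_constellation(order):
    """Square M-QAM points scaled to unit average power (order = 4, 16, 64, ...)."""
    side = int(round(np.sqrt(order)))
    if side * side != order:
        raise ValueError("order must be a perfect square")
    levels = np.arange(-(side - 1), side, 2, dtype=float)  # e.g. [-3, -1, 1, 3]
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot samples from categorical logits (assumed relaxation)."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    z = (logits + gumbel) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=-1, keepdims=True)


def awgn(symbols, snr_db, rng):
    """Add complex Gaussian noise; noise power is set relative to unit
    constellation power, which is an assumption of this sketch."""
    noise_power = 10.0 ** (-snr_db / 10.0)
    noise = rng.standard_normal(symbols.shape) + 1j * rng.standard_normal(symbols.shape)
    return symbols + np.sqrt(noise_power / 2.0) * noise


rng = np.random.default_rng(0)
points = qam_constellation(16)

# Stand-in for the encoder-modulator output: one row of logits per channel use.
logits = rng.standard_normal((128, 16))

# Each row of weights is a relaxed one-hot vector over the 16 symbols, so the
# soft symbol is a weighted mix of constellation points through which gradients
# could flow in an autodiff framework.
weights = gumbel_softmax(logits, temperature=0.5, rng=rng)
soft_symbols = weights @ points
received = awgn(soft_symbols, snr_db=0.0, rng=rng)
```

Lowering the temperature pushes each weight row toward a hard one-hot choice, so the soft symbols approach actual constellation points; an implementation in an autodiff framework would typically use a hard sample in the forward pass and the relaxed one for gradients.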

Paper Structure (22 sections, 5 theorems, 29 equations, 9 figures, 3 tables)

Introduction
Proposed JCM Framework
System Model
Objective Function
Variational Learning of The JCM Framework
Loss Function Design Based on Variational Inference Lower Bound
Transition Probability Model of The Encoder-Modulator
Loss Function for Image Semantic Communications
Differentiable Constellation Symbol Generation
Experiment Results
Experiment Settings
Datasets
Neural Network Architecture And Hyper-parameters
Benchmarks
Performances against Varying SNRs
...and 7 more sections

Key Result

Theorem 1

A variational inference lower bound (VILB) of the mutual-information objective is given by the paper's ELBO expression, where $K=H(\mathbf{S})+\lambda\cdot H(\mathbf{X})$ is a constant.
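The form of $K$ suggests how such a bound can arise. The sketch below is not the paper's stated derivation: it assumes the objective is a weighted sum of two mutual-information terms, with $\hat{\mathbf{Y}}$ denoting the received symbols, $\lambda \ge 0$, and $q_s$, $q_x$ denoting variational decoders. None of that notation appears in this summary.

```latex
% Assumed objective: I(S; \hat{Y}) + \lambda I(X; \hat{Y}), with \lambda \ge 0.
\begin{align}
I(\mathbf{S};\hat{\mathbf{Y}}) + \lambda\, I(\mathbf{X};\hat{\mathbf{Y}})
  &= H(\mathbf{S}) - H(\mathbf{S}\mid\hat{\mathbf{Y}})
     + \lambda\bigl[H(\mathbf{X}) - H(\mathbf{X}\mid\hat{\mathbf{Y}})\bigr] \\
  &= K - H(\mathbf{S}\mid\hat{\mathbf{Y}}) - \lambda\, H(\mathbf{X}\mid\hat{\mathbf{Y}}) \\
  &\ge K + \mathbb{E}\bigl[\log q_s(\mathbf{s}\mid\hat{\mathbf{y}})\bigr]
       + \lambda\, \mathbb{E}\bigl[\log q_x(\mathbf{x}\mid\hat{\mathbf{y}})\bigr]
\end{align}
% The inequality holds because a conditional entropy is never larger than the
% cross-entropy under any variational distribution q (KL divergence >= 0).
```

Under these assumptions, $K$ depends only on the data distribution, so maximizing the bound reduces to maximizing the two expected log-likelihood terms.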

Figures (9)

Figure 1: The proposed JCM system model.
Figure 2: Performances of the classification accuracy and the image recovery with varying channel SNRs and three different modulation schemes: BPSK, 4QAM and 16QAM. The number of channel uses is set at 128.
Figure 3: PSNR vs. SNR on Tiny ImageNet with 1024 channel uses. The modulation schemes are set as 16QAM and 64QAM.
Figure 4: Performance comparison with varying transmission rates in 16QAM. SNR is set at 0 dB.
Figure 5: Examples of recovered images at SNR $=$ 0 dB with 1024 channel uses. The modulation scheme is set as 16QAM.
...and 4 more figures

Theorems & Definitions (5)

Theorem 1: VILB
Proposition 1: Transition probability of BPSK
Proposition 2: Transition probability of M-QAM
Corollary 1: Loss function for image semantic communications
Lemma 1
