Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels

Haotian Wu, Yulin Shao, Chenghong Bian, Krystian Mikolajczyk, Deniz Gündüz

TL;DR

This work addresses the challenge of efficient, robust image transmission over MIMO channels under practical constraints by proposing DeepJSCC-MIMO, a vision-transformer-based joint source-channel coding scheme. It unifies open-loop CSIR and closed-loop CSIT scenarios through a channel heatmap and self-attention, leveraging SVD-based precoding and MIMO equalization within an end-to-end learnable framework. The approach delivers significant PSNR and perceptual gains over separation-based baselines across a wide range of SNRs, bandwidth ratios, and antenna configurations, and remains robust to channel estimation errors and higher-resolution datasets. The method reduces retraining needs, maintains efficiency on GPUs, and provides interpretable channel-adaptive behavior, making it well-suited for emerging semantic communication systems.
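As a rough illustration of the SVD-based precoding mentioned above, the standard textbook derivation for the closed-loop (CSIT) case can be sketched as follows. The symbols $\bm{X}$, $\bm{Y}$, $\bm{Z}$, $\bm{W}$, $\bm{U}$, $\bm{V}$, and $s_k$ are notation assumed here for the sketch; only $\bm{H}$ and $\sigma_w^2$ appear in the text of this page, and the paper's own formulation may differ in detail.

```latex
% Assumed model: Y is the channel output, X the channel input, W additive noise
% with i.i.d. complex Gaussian entries of variance \sigma_w^2.
\begin{align}
  \bm{Y} &= \bm{H}\bm{X} + \bm{W},
  & \bm{H} &= \bm{U}\bm{\Sigma}\bm{V}^{\mathsf{H}}
  && \text{(SVD of the channel matrix)} \\
  \bm{X} &= \bm{V}\bm{Z}
  &&&& \text{(precoding of the encoder output } \bm{Z}\text{)} \\
  \bm{U}^{\mathsf{H}}\bm{Y} &= \bm{\Sigma}\bm{Z} + \bm{U}^{\mathsf{H}}\bm{W}
  &&&& \text{(receiver-side combining)}
\end{align}
% Because \Sigma is diagonal with singular values s_1 \ge s_2 \ge \dots,
% the k-th row reduces to a scalar subchannel:
%   y'_k = s_k z_k + w'_k,
% and since U is unitary, the rotated noise U^H W has the same statistics as W.
```

Under these assumptions the MIMO channel decomposes into parallel subchannels whose gains are the singular values of $\bm{H}$, which is what makes a per-subchannel power allocation meaningful.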

Abstract

This paper introduces a vision transformer (ViT)-based deep joint source and channel coding (DeepJSCC) scheme for wireless image transmission over multiple-input multiple-output (MIMO) channels, denoted as DeepJSCC-MIMO. We consider DeepJSCC-MIMO for adaptive image transmission in both open-loop and closed-loop MIMO systems. The novel DeepJSCC-MIMO architecture surpasses the classical separation-based benchmarks with robustness to channel estimation errors and showcases remarkable flexibility in adapting to diverse channel conditions and antenna numbers without requiring retraining. Specifically, by harnessing the self-attention mechanism of ViT, DeepJSCC-MIMO intelligently learns feature mapping and power allocation strategies tailored to the unique characteristics of the source image and prevailing channel conditions. Extensive numerical experiments validate the significant improvements in transmission quality achieved by DeepJSCC-MIMO for both open-loop and closed-loop MIMO systems across a wide range of scenarios. Moreover, DeepJSCC-MIMO exhibits robustness to varying channel conditions, channel estimation errors, and different antenna numbers, making it an appealing solution for emerging semantic communication systems.

Paper Structure (37 sections, 33 equations, 16 figures, 13 tables, 1 algorithm)

Figures (16)

  • Figure 1: Block diagram of the MIMO image transmission system: (a) conventional separate source-channel coding scheme and (b) DeepJSCC scheme, where the gray blocks with dashed lines are the additional operations for the closed-loop MIMO system.
  • Figure 2: The pipeline of the DeepJSCC-MIMO scheme, where the source image $\bm{S}$ is encoded by a ViT-encoder and reconstructed by a ViT-decoder as $\bm{\hat{S}}$. The precoding operation (dashed line) is performed only if the CSI is available at the transmitter. The CSI ($\bm{H},\sigma_w^2$) is fed to the decoder, and to the encoder when available, in the form of a "heatmap" $\bm{M}$ to facilitate the JSCC encoding/decoding process.
  • Figure 3: The architecture of the ViT-based encoder and decoder, where both encoder and decoder comprise linear projection layers, a positional embedding layer, and multiple transformer layers.
  • Figure 4: Structure of the DL-aided channel equalization method, where the input at the $i$-th channel use is the concatenation of all elements within $\bm{H}$ and the channel output
  • Figure 5: Performance comparisons between the proposed DeepJSCC-MIMO and BPG-Capacity over different SNR values and bandwidth ratios for the MIMO systems with CSIR.
  • ...and 11 more figures