Learnable Residual-Based Latent Denoising in Semantic Communication

Mingkai Xu, Yongpeng Wu, Yuxuan Shi, Xiang-Gen Xia, Wenjun Zhang, Ping Zhang

TL;DR

This work addresses robust image transmission over noisy channels with a latent denoising semantic communication (SemCom) framework. It pairs a Swin Transformer–based joint source-channel coding (JSCC) codec with a learnable residual latent denoiser that iteratively removes channel noise. A cosine-similarity-based similarity score (SS) determines how many denoising steps are run, and therefore the latency. Training proceeds in three stages: end-to-end JSCC optimization, insertion of the latent denoiser, and joint fine-tuning. The paper reports significant PSNR and MS-SSIM gains over separate source-channel coding (SSCC) and diffusion-model baselines, especially at ultra-low SNRs, with effective noise removal and high-quality reconstructions at manageable latency.
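To make the similarity score concrete, the sketch below measures the cosine similarity between a clean latent vector and its copy after an additive white Gaussian noise (AWGN) channel at several SNRs. It is an illustration only: the function names `add_awgn` and `similarity_score`, the random Gaussian latent, and the vector length are assumptions chosen for the example, not details taken from the paper.

```python
import math
import random


def add_awgn(latent, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power equals snr_db."""
    signal_power = sum(x * x for x in latent) / len(latent)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in latent]


def similarity_score(a, b):
    """Cosine similarity between two latent vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical latent: i.i.d. Gaussian entries stand in for encoder output.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(4096)]

for snr_db in (-10, 0, 10):
    noisy = add_awgn(clean, snr_db, rng)
    print(f"SNR {snr_db:>3} dB -> similarity {similarity_score(clean, noisy):.3f}")
```

For a zero-mean latent with independent noise, the similarity concentrates around $1/\sqrt{1 + 1/\mathrm{SNR}}$ (SNR in linear scale), so it falls steadily as the channel degrades. That monotonic link is what makes it plausible to predict the score from the channel SNR.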

Abstract

A latent denoising semantic communication (SemCom) framework is proposed for robust image transmission over noisy channels. A learnable latent denoiser at the receiver preprocesses the received signal to remove channel noise and recover the semantic information, thereby improving the quality of the decoded images. Specifically, the latent denoising mapping is built by iterative residual learning, which improves denoising efficiency while keeping performance stable. Moreover, the channel signal-to-noise ratio (SNR) is used to estimate and predict the latent similarity score (SS) for conditional denoising: the number of denoising steps is adapted to the predicted SS sequence, which further reduces communication latency. Finally, simulations demonstrate that the proposed framework removes channel noise effectively and efficiently at various noise levels and reconstructs visually appealing images.
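The adaptive loop described above can be sketched in a few lines. Everything here is a stand-in under stated assumptions: `initial_ss_from_snr` uses the closed-form cosine similarity for a zero-mean signal in independent AWGN in place of a learned similarity predictor, `predict_ss_sequence` assumes each step closes a fixed fraction of the remaining gap to 1, `smoothing_residual` is a toy local-mean filter in place of a trained residual predictor, and the subtractive update `z - residual` is one plausible form of residual denoising. None of these are taken from the paper.

```python
import math
import random


def initial_ss_from_snr(snr_db):
    """Assumed stand-in for a learned predictor: expected cosine similarity in AWGN."""
    snr = 10 ** (snr_db / 10)
    return 1.0 / math.sqrt(1.0 + 1.0 / snr)


def predict_ss_sequence(snr_db, max_steps, gain=0.5):
    """Hypothetical SS trajectory: each step closes `gain` of the gap to 1."""
    ss = initial_ss_from_snr(snr_db)
    sequence = [ss]
    for _ in range(max_steps):
        ss += gain * (1.0 - ss)
        sequence.append(ss)
    return sequence


def choose_num_steps(ss_sequence, target_ss):
    """Smallest step count whose predicted SS reaches the target (else the maximum)."""
    for step, ss in enumerate(ss_sequence):
        if ss >= target_ss:
            return step
    return len(ss_sequence) - 1


def smoothing_residual(z, step):
    """Toy residual predictor: half of each entry's deviation from its 3-point mean."""
    n = len(z)
    residual = []
    for i in range(n):
        local_mean = (z[max(i - 1, 0)] + z[i] + z[min(i + 1, n - 1)]) / 3.0
        residual.append(0.5 * (z[i] - local_mean))
    return residual


def denoise(latent, num_steps, residual_predictor):
    """Iteratively subtract the predicted residual from the latent."""
    z = list(latent)
    for step in range(num_steps):
        residual = residual_predictor(z, step)
        z = [zi - ri for zi, ri in zip(z, residual)]
    return z


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Demo on a smooth synthetic "latent" (a sine wave with power 0.5) at 0 dB SNR.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * i / 64) for i in range(512)]
snr_db = 0
sigma = math.sqrt(0.5 / 10 ** (snr_db / 10))
noisy = [x + rng.gauss(0.0, sigma) for x in clean]

steps = choose_num_steps(predict_ss_sequence(snr_db, max_steps=8), target_ss=0.95)
restored = denoise(noisy, steps, smoothing_residual)
print("steps:", steps)
print("similarity before:", cosine_similarity(clean, noisy))
print("similarity after: ", cosine_similarity(clean, restored))
```

Under these assumptions a cleaner channel starts from a higher predicted score and therefore needs fewer steps, which is the latency saving the abstract refers to. The predicted trajectory here is not fitted to the toy filter; in a real system both would be learned jointly.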

Paper Structure

This paper contains 18 sections, 8 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: The overall framework of our proposed latent denoising and transmission scheme. In the diagram, the denoising process is iterated for $T$ steps.
  • Figure 2: An illustration of our proposed denoising process and its optimization objective.
  • Figure 3: The architectures of the residual predictor and the similarity predictor.
  • Figure 4: Comparison of the reconstruction results of our method and the baselines over the AWGN channel. (a) PSNR versus SNR. (b) MS-SSIM versus SNR.
  • Figure 5: Comparisons of reconstruction results with different values of $s_1$.
  • ...and 2 more figures