WVSC: Wireless Video Semantic Communication with Multi-frame Compensation

Bingyan Xie, Yongpeng Wu, Yuxuan Shi, Biqian Feng, Wenjun Zhang, Jihong Park, Tony Q. S. Quek

TL;DR

This paper tackles the inefficiency of pixel-level wireless video transmission by introducing WVSC, a semantic-communication–driven framework that encodes frames into compact semantic representations. It replaces per-frame motion vectors with a reference semantic frame and uses a multi-frame fusion attention module to refine the current frame, achieving higher bandwidth efficiency without sacrificing quality. The approach demonstrates PSNR gains of about $1$ dB over a DL-based pixel-level baseline and around $2$ dB over traditional SSCC schemes, with robust performance across SNRs, CBRs, and moderate GoP sizes. The work advances practical wireless video systems by integrating semantic coding with deep video coding and cross-frame compensation to reduce communication overhead while preserving perceptual quality.

Abstract

Existing wireless video transmission schemes conduct video coding directly at the pixel level and neglect the semantics contained in videos. In this paper, we propose a wireless video semantic communication framework, abbreviated as WVSC, which integrates the idea of semantic communication into wireless video transmission scenarios. WVSC first encodes the original video frames as semantic frames and then conducts video coding on these compact representations, so that video coding takes place at the semantic level rather than the pixel level. Moreover, to further reduce the communication overhead, a reference semantic frame is introduced to replace the per-frame motion vectors used in common video coding methods. At the receiver, multi-frame compensation (MFC) is proposed to produce the compensated current semantic frame with a multi-frame fusion attention module. With both reference frame transmission and MFC, bandwidth efficiency improves while video transmission performance remains satisfactory. Experimental results verify a PSNR gain for WVSC of about 1 dB over other DL-based methods, e.g., DVSC, and about 2 dB over traditional schemes.
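Since the reported gains are stated in PSNR, a minimal sketch of how that metric is typically computed may help in reading them. This is the standard textbook definition, not code from the paper: the function name `psnr`, the flattened pixel lists, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper: frames are assumed to be flattened into plain
    sequences of pixel intensities, with max_value as the peak intensity
    (255 for 8-bit video, an assumption here).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must be non-empty")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of about 1 dB at a fixed peak value corresponds to roughly 21% lower MSE (a factor of $10^{0.1} \approx 1.26$), and a gain of about 2 dB to roughly 37% lower MSE (a factor of $10^{0.2} \approx 1.58$).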

Paper Structure

This paper contains 18 sections, 10 equations, 7 figures.

Figures (7)

  • Figure 1: Different structures of wireless video transmission frameworks. (a) pixel-level wireless video transmission structure. (b) proposed semantic-level wireless video transmission structure.
  • Figure 2: The proposed WVSC framework. The video is transmitted as a series of GoPs, each divided into a reference frame $\mathbf{x}^\mathrm{ref}$ and current frames $\mathbf{x}^i$. $\mathbf{x}^\mathrm{ref}$ is directly coded and transmitted through semantic coding, while $\mathbf{x}^i$ is compressed by both semantic coding and video coding aided by $\mathbf{x}^\mathrm{ref}$.
  • Figure 3: The structure of the motion estimation & compensation network.
  • Figure 4: The architecture of MFA module.
  • Figure 5: Quality of the reconstructed images versus the SNRs under Rayleigh fading channels (CBR = 0.04).
  • ...and 2 more figures